Abstract

We consider a general class of statistical mechanical models of coherent struc- 
tures in turbulence, which includes models of two-dimensional fluid motion, quasi- 
geostrophic flows, and dispersive waves. First, large deviation principles are proved 
for the canonical ensemble and the microcanonical ensemble. For each ensemble
the set of equilibrium macrostates is defined as the set on which the corresponding 
rate function attains its minimum of 0. We then present complete equivalence and 
nonequivalence results at the level of equilibrium macrostates for the two ensembles. 

Microcanonical equilibrium macrostates are characterized as the solutions of a 
certain constrained minimization problem, while canonical equilibrium macrostates 
are characterized as the solutions of an unconstrained minimization problem in 
which the constraint in the first problem is replaced by a Lagrange multiplier. The 
analysis of equivalence and nonequivalence of ensembles reduces to the following 
question in global optimization. What are the relationships between the set of 
solutions of the constrained minimization problem that characterizes microcanonical 
equilibrium macrostates and the set of solutions of the unconstrained minimization 
problem that characterizes canonical equilibrium macrostates? 

In general terms, our main result is that a necessary and sufficient condition for 
equivalence of ensembles to hold at the level of equilibrium macrostates is that it 
holds at the level of thermodynamic functions, which is the case if and only if the 
microcanonical entropy is concave. The necessity of this condition is new and has
the following striking formulation. If the microcanonical entropy is not concave at 
some value of its argument, then the ensembles are nonequivalent in the sense that 
the corresponding set of microcanonical equilibrium macrostates is disjoint from 
any set of canonical equilibrium macrostates. We point out a number of models of 
physical interest in which nonconcave microcanonical entropies arise. 

We also introduce a new class of ensembles called mixed ensembles, obtained by 
treating a subset of the dynamical invariants canonically and the complementary 
set microcanonically. Such ensembles arise naturally in applications where there 
are several independent dynamical invariants, including models of dispersive waves 
for the nonlinear Schrödinger equation. Complete equivalence and nonequivalence
results are presented at the level of equilibrium macrostates for the pure canonical, 
the pure microcanonical, and the mixed ensembles. 
1 Introduction



1.1 Overview 

A wide variety of complex physical systems described by nonlinear partial differential 
equations exhibit asymptotic phenomena that are much too complicated to study by 
purely analytic methods. In order to gain a fuller understanding of such phenomena, ana- 
lytic methods are supplemented by numerical simulations or the systems are modeled via 
the formalism of statistical mechanics, which often yields uncannily accurate predictions 
concerning the system's asymptotic behavior. 

An important class of complex physical systems for which the formalism of statis- 
tical mechanics provides accurate predictions arises in the study of turbulence; e.g., 
two-dimensional fluid motions, quasi-geostrophic flows, two-dimensional magnetofluids,
plasmas, and dispersive waves. In each case important features of the asymptotic behavior
of the underlying nonlinear partial differential equation (the two-dimensional Euler
equations, the quasi-geostrophic potential vorticity equation, the magnetohydrodynamic
equations, the Vlasov-Poisson equation, and the nonlinear Schrödinger equation) can be
effectively captured in a statistical mechanical model. A distinguishing feature of such
systems is that a free evolution from a generic initial condition exhibits a separation-of-scales
behavior: coherent structures are formed on large scales (e.g., vortices and shears in the
case of fluid motion or solitons in the case of dispersive waves), while random fluctuations
are generated on small scales. A major goal of any description of the system, whether
analytic, numeric, or statistical, is to predict the formation, interaction, and persistence 
of such coherent structures. 

The purpose of the present paper is to provide the theoretical basis for statistical 
mechanical studies of specific models of turbulence that are analyzed elsewhere. These
include two-dimensional fluids ||, quasi-geostrophic flows [16], and dispersive waves [17].
In each case the model is defined on a fixed flow domain in terms of a sequence of finite-dimensional
systems indexed by $n \in \mathbb{N}$. Coherent structures are studied in the continuum
limit, obtained by sending $n \to \infty$. They are characterized by variational principles, the
solutions of which define equilibrium macrostates. In contrast to the detailed descrip- 
tion required by the associated nonlinear partial differential equation and by the finite- 
dimensional systems that discretize them, these equilibrium macrostates provide a vastly 
contracted description. The variational principles are derived and analyzed via the theory 
of large deviations and duality theory for concave functions. 

In these models the sequence of finite-dimensional systems is defined on a fixed domain 
in terms of a long-range interaction with a local mean-field scaling. In order to obtain 
a nontrivial limit, one must scale the inverse temperature by a parameter tending to 
infinity. By altering the scaling and making other superficial changes, our results can 
also be applied to classical lattice models such as the Ising model of a ferromagnet. Such 
models are typically defined in terms of the thermodynamic limit of a sequence of finite- 
dimensional systems having a finite-range or summable interaction. In such applications 
a basic stochastic process that arises in the large deviation analysis is the empirical field, 
which has been studied by a number of authors including W\ SO, EHI, E01. Other papers 
that investigate the equivalence of ensembles in the traditional thermodynamic or bulk
limit include |]] and [|7j . 



There is a large literature on the equivalence of ensembles for classical lattice systems
and related models. It is reviewed in part in the introduction to [Q, to which the reader
is referred for references. In particular, a number of papers including [12, 21, 32, IB] 
investigate the equivalence of ensembles using the theory of large deviations. Of these 
papers, considers the problem in the greatest generality, obtaining a criterion for 
the equivalence of ensembles in terms of the vanishing of the specific information gain 
of a sequence of conditioned measures with respect to a sequence of tilted measures. 
However, despite the mathematical sophistication of these and other studies, none of them 
explicitly addresses the general issue of the nonequivalence of ensembles, which seems to 
be the typical behavior for the models of turbulence that the present paper analyzes. In 
[32, §7.3] and [33, §7] there is a discussion of the nonequivalence of ensembles for the
simplest mean-field model in statistical mechanics; namely, the Curie-Weiss model of a
ferromagnet. For a general class of local mean-field models of turbulence, the present 
paper addresses this and related issues. 

In much of the classical literature on statistical mechanical approaches to two-dimensional 
turbulence, it is tacitly assumed that the microcanonical and canonical ensembles give 
equivalent results |K| [39[| . Recently, however, in the context of the point vortex and 
related models, this tacit assumption has been directly addressed. Questions concerning 
the equivalence and nonequivalence of ensembles for these models have been investigated
by a number of authors, including M, 19, 26, 28]. The present paper, inspired in part by
[19], is the first to present complete and definitive results for a general class of models,
with a particular emphasis upon the nonequivalence of ensembles.

An unexpected connection of our work in this paper is to dynamic stability analysis. 
To date, all studies of the nonlinear stability of two-dimensional flows have been carried 
out using the Lyapunov functionals introduced by Arnold [Q, |3]. When these deterministic
results are reformulated in the setting of statistical mechanical models, they can
be expressed in terms of the second-order conditions satisfied by canonical equilibrium 
macrostates. In the cases when the microcanonical entropy is not concave and thus the en- 
sembles are nonequivalent, the Arnold sufficient conditions for nonlinear stability are not 
satisfied by the microcanonical equilibrium macrostates. Nevertheless, the second-order 
conditions satisfied by these macrostates allow us to refine the classical Arnold theorems
by proving the nonlinear stability of a new class of two-dimensional flows. In [16] these
ideas are developed for the quasi-geostrophic potential vorticity equation, which describes 
the dynamics of rotating, shallow water systems in nearly geostrophic balance. The work 
in that paper has possible applications to the stability of planetary flows; specifically, to 
the stability of zonal shear flows and embedded vortices in Jovian-type atmospheres. 

In the next two subsections we present an overview of the main results in this paper, 
stripped of all technicalities. This is done in the context of a well-known statistical me- 
chanical model of the two-dimensional Euler equations known as the Miller-Robert model. 
Results formulated in great generality to apply to this and other models of turbulence are 
given in Sections 2-5 of this paper. We start by presenting large deviation principles with 
respect to the canonical ensemble and the microcanonical ensemble. For each ensemble 
we then define the set of equilibrium macrostates as the set on which the associated rate
function attains its minimum of 0. A fundamental question arises. Are the two ensem- 
bles equivalent at the level of equilibrium macrostates? That is, does each equilibrium 
macrostate with respect to one ensemble correspond to an equilibrium macrostate with 
respect to the other ensemble? In Section 4, definitive and sharp results on the equivalence 
and nonequivalence of the ensembles are presented. 

In general terms, our main result is that a necessary and sufficient condition for the 
equivalence of ensembles to hold at the level of equilibrium macrostates is that it holds 
at the level of thermodynamic functions. In proving this, we go beyond the important 
work in [[3^], which proves that for a general class of models including the classical lattice
gas, thermodynamic equivalence of ensembles is a sufficient condition for macrostate
equivalence of ensembles. Our proof that thermodynamic equivalence is also a necessary 
condition for macrostate equivalence is perhaps the most striking discovery of our work. 
Specifically, we show that whenever a quantity known as the microcanonical entropy is not 
concave, the ensembles are nonequivalent in the sense that the set of microcanonical equi- 
librium macrostates is richer than the set of canonical equilibrium macrostates. In fact, 
the latter set contains none of the microcanonical equilibrium macrostates corresponding 
to nonconcave portions of the entropy [see Thm. 4.5(b)]. Useful, but less concrete, connections
between the nonconcavity of the microcanonical entropy and nonequivalence of
ensembles can also be deduced from the abstract results in [32] [see their §5 and §6]. On
the other hand, our results are formulated in order to apply directly to statistical mechanical
models of turbulence for which nonconcave microcanonical entropies frequently
and naturally arise, particularly in physically interesting regions corresponding to a range 
of negative temperatures. Several such examples are mentioned in Section 1.4. 

Besides the results on equivalence and nonequivalence of ensembles, we also prove that 
for the Miller-Robert model and other models microcanonical equilibrium macrostates 
have an equivalent characterization in terms of constrained maximum entropy principles 
(see Remark 3.4). Our approach to this question seems simpler and more intuitive than
the approach taken in |37], ^2], fl3| . The derivation of constrained maximum entropy 
principles based on the microcanonical ensemble brings to fruition the work begun in 
||, where unconstrained maximum entropy principles based on the canonical ensemble 
are derived. Our proof that microcanonical equilibrium macrostates are characterized as 
solutions of constrained maximum entropy principles is an important contribution because 
such principles are the basis for numerical computations of equilibrium macrostates and 
coherent structures for the Miller-Robert model and other models |T4], |ST], |52 . 



In systems having multiple conserved quantities, one also has the option of studying 
mixed ensembles. These are defined by treating a subset of the conserved quantities 
canonically and the complementary subset of conserved quantities microcanonically. In 
Section 5 we derive large deviation principles with respect to such ensembles and give 
complete results on their equivalence and nonequivalence, at the level of equilibrium 
macrostates, with the microcanonical ensemble and the canonical ensemble. Although 
mixed ensembles arise naturally in a number of applications, they have not been studied 
in a general setting in the statistical mechanical literature. 

An important application of mixed ensembles is to the study of dispersive waves and
soliton turbulence for the nonlinear Schrödinger equation [17]. This equation has two conserved
quantities, the Hamiltonian and the particle number. In the associated statistical
mechanical model, the canonical ensemble cannot be defined because the partition function
does not converge. Instead, one must consider either a microcanonical ensemble or a
mixed ensemble in which the Hamiltonian is treated canonically and the particle number
microcanonically. By applying to the mixed ensemble a large deviation result for Gaussian
processes derived in [18], in [17] we are able to justify rigorously a mean-field theoretic
approach to soliton turbulence presented in [23]. The agreement between the predictions
of the statistical mechanical model and long-time simulations of the microscopic dynamics
is excellent [23].



1.2 Ensembles and Large Deviation Principles 

The Euler equations describe the time evolution of the velocity field of an inviscid, incompressible
fluid in a spatial domain, which for simplicity we take to be the unit torus
$T^2$ with periodic boundary conditions. At time $t > 0$ the velocity field at a position
$x = (x_1, x_2) \in T^2$ is denoted $(v_1(x,t), v_2(x,t))$. The Euler equations can be cast in the form
of an infinite-dimensional Hamiltonian system having a family of other conserved quantities
called generalized enstrophies. A central goal of theoretical, numerical, and statistical
studies is to relate the asymptotic behavior of the vorticity $\omega(x,t) = v_{2,x_1}(x,t) - v_{1,x_2}(x,t)$
to the formation, interaction, and persistence of coherent structures of the fluid motion.
A model that can be used to carry this out was proposed independently by Miller et al.
[pq, [39] and Robert et al. [43], [PJ and is known as the Miller-Robert model. In order to
define it, one first discretizes the continuum dynamics described by the Euler equations, 
and then in terms of the discretized dynamics one defines a sequence of statistical equilibrium
models on suitable finite lattices $\mathcal{L}_n$ of $T^2$. Details are given in part (b) of Example
2.3. These lattice models describe the joint probability distributions of certain vorticity
random variables $\zeta(s)$ defined for each site $s \in \mathcal{L}_n$. We denote by $\zeta$ the configuration or
microstate $\{\zeta(s), s \in \mathcal{L}_n\}$; by $a_n$ the number of sites in $\mathcal{L}_n$; by $\mathcal{Y}$ the common range of
$\zeta(s)$; by $H_n(\zeta)$ the Hamiltonian for $\zeta$, which is a certain quadratic function of the $\zeta(s)$
that approximates the continuum Hamiltonian; by $A_n(\zeta)$ the generalized enstrophy of $\zeta$,
which approximates the continuum generalized enstrophy; and by $P_n$ the prior distribution
of $\zeta$, which is a certain product measure on the configuration space $\mathcal{Y}^{a_n}$. In order
to simplify the present description, we absorb $A_n$ in $P_n$; in |j| a physical justification is
given, in the context of a related model, for absorbing the generalized enstrophy $A_n$ in
the prior distribution $P_n$. Thus for the purpose of this introduction, the Miller-Robert
model is defined in terms of a single conserved quantity, the Hamiltonian. As in many 
other models of turbulence, the Hamiltonian in the Miller-Robert model has a long-range 
interaction and incorporates a local mean-field scaling. 

For other models of turbulence having the Hamiltonian as the only conserved quantity, 
much of the following discussion is valid with minimal changes in notation; in particular, 
the forms of the large deviation principles in the present subsection and the results on 
equivalence and nonequivalence of ensembles in the next subsection. For models having 
multiple conserved quantities, the following discussion is easily adapted by replacing certain
scalars with vectors. The general class of models considered in this paper is defined
in terms of the quantities in Hypotheses 2.1. In order for a large deviation analysis of the
model to be feasible, these quantities must satisfy Hypotheses |2^ . 

We begin our overview of the main results in this paper by appealing to the formalism 
of equilibrium statistical mechanics, which provides two joint probability distributions 
for microstates $\zeta \in \mathcal{Y}^{a_n}$. The physically fundamental distribution known as the microcanonical
ensemble models the fact that the Hamiltonian is a constant of the Euler
dynamics. Probabilistically, this is expressed by conditioning $P_n$ on the energy shell
$\{\zeta \in \mathcal{Y}^{a_n} : H_n(\zeta) = u\}$, where $u \in \mathbb{R}$ is determined by the initial conditions. However, in
order to avoid problems with the existence of regular conditional probability distributions,
we shall condition $P_n$ on the thickened energy shell $\{H_n(\zeta) \in [u-r, u+r]\}$, where $r > 0$.
Thus, the microcanonical ensemble is the measure defined for Borel subsets $B$ of $\mathcal{Y}^{a_n}$ by

$$P_n^{u,r}\{B\} = P_n\{B \mid H_n \in [u-r, u+r]\} = \frac{P_n\{B \cap \{H_n \in [u-r, u+r]\}\}}{P_n\{H_n \in [u-r, u+r]\}};$$

this is well defined provided the denominator in the last expression is positive. The letter
$u$ is used in the definition of the microcanonical ensemble rather than the more usual letter
$E$ because this is a special case of a general theory that applies to models having multiple
conserved quantities; for such models $u \in \mathbb{R}$ is replaced by a vector $u$ representing a fixed
value of the vector of conserved quantities.

A mathematically more tractable joint probability distribution is the canonical ensemble,
defined for Borel subsets $B$ of $\mathcal{Y}^{a_n}$ by

$$P_{n,\beta}\{B\} = \frac{1}{Z(n,\beta)} \int_B \exp[-\beta H_n] \, dP_n.$$

Here $\beta$ is a real number denoting the inverse temperature and $Z(n,\beta)$ is the partition function
$\int_{\mathcal{Y}^{a_n}} \exp[-\beta H_n] \, dP_n$. This is a normalization constant that makes $P_{n,\beta}$ a probability
measure.

The main mathematical tool that we shall use to predict the formation of coherent 
structures is the theory of large deviations. In the case of the Miller-Robert model, a 
crucial innovation implemented in M for the canonical ensemble is to study the asymptotic
behavior of a random probability measure $Y_n(\zeta)$ that is closely related to a certain coarse
graining of the random vorticity field (see part (b) of Example 2.3). This coarse graining
is defined in terms of the empirical measures of $\zeta(s)$ for $s$ in certain macrocells of the
lattice $\mathcal{L}_n$. $Y_n$ takes values in a certain subset $\mathcal{X}$ of the space of probability measures
on $T^2 \times \mathcal{Y}$. Elements $\mu$ of $\mathcal{X}$ are called macrostates. While $Y_n$ is basic to analyzing the
asymptotic behavior of the model, its definition is far from obvious. For that reason we
call $Y_n$ a hidden process and $\mathcal{X}$ a hidden space for the Miller-Robert model.

The hidden process $Y_n$ has two properties that make a large deviation analysis of
the Miller-Robert model possible. For details, the reader is referred to 0. First, an
application of Sanov's Theorem shows that with respect to the a priori distribution $P_n$,
$Y_n$ satisfies the large deviation principle on $\mathcal{X}$ with rate function $I(\mu)$ given by the relative
entropy of $\mu \in \mathcal{X}$ with respect to a certain base measure. We record this fact by the formal
notation

$$P_n\{Y_n \in B(\mu,\alpha)\} \approx \exp[-a_n I(\mu)] \quad \text{as } n \to \infty,\ \alpha \to 0. \tag{1.2.1}$$

In this formula $B(\mu,\alpha)$ denotes the open ball with center $\mu$ and radius $\alpha$ with respect
to an appropriate metric on $\mathcal{X}$. Second, there exists a bounded continuous function $\tilde{H}$
mapping $\mathcal{X}$ into $\mathbb{R}$ with the property that uniformly over microstates the Hamiltonian
$H_n(\zeta)$ is asymptotic to $\tilde{H}(Y_n(\zeta))$ as $n \to \infty$; in symbols,

$$\lim_{n \to \infty} \sup_{\zeta \in \mathcal{Y}^{a_n}} |H_n(\zeta) - \tilde{H}(Y_n(\zeta))| = 0. \tag{1.2.2}$$

$\tilde{H}$ is called the Hamiltonian representation function.
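
The exponential decay asserted by Sanov's Theorem in (1.2.1) can be checked numerically in the simplest possible setting. The sketch below is only an illustration under simplifying assumptions that are not part of the Miller-Robert model: each site variable takes two values with probability p, there is no spatial coarse graining, and the probability of a small ball is replaced by the probability of a single value of the empirical frequency. All function names are hypothetical. The quantity -(1/n) log P should approach the relative entropy of the target frequency q with respect to p as n grows.

```python
import math

def relative_entropy(q, p):
    """Relative entropy of a Bernoulli(q) law with respect to a Bernoulli(p) law."""
    total = 0.0
    for a, b in ((q, p), (1.0 - q, 1.0 - p)):
        if a > 0.0:
            total += a * math.log(a / b)
    return total

def log_binomial_pmf(n, k, p):
    """log P{k successes in n independent Bernoulli(p) trials}, computed via lgamma."""
    log_comb = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_comb + k * math.log(p) + (n - k) * math.log(1.0 - p)

def empirical_rate(n, q, p):
    """-(1/n) log P{empirical frequency of successes equals round(n*q)/n}."""
    k = round(n * q)
    return -log_binomial_pmf(n, k, p) / n

# Assumed toy parameters: prior success probability p, target frequency q.
p, q = 0.5, 0.8
target = relative_entropy(q, p)
rates = {n: empirical_rate(n, q, p) for n in (10, 100, 1000, 10000)}
for n, rate in rates.items():
    print(n, round(rate, 5), round(target, 5))
```

In this toy setting the gap between the finite-n rate and the relative entropy shrinks roughly like (log n)/n, which is the subexponential correction that the formal notation in (1.2.1) suppresses.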

Using (1.2.2), one derives from the large deviation principle for the $P_n$-distributions
of $Y_n$ the asymptotic behavior of $Y_n$ with respect to the two ensembles $P_n^{u,r}$ and $P_{n,a_n\beta}$.
For appropriate values of $u$ and $\beta$ these are expressed by the formal notation

$$P_n^{u,r}\{Y_n \in B(\mu,\alpha)\} \approx \exp[-a_n I^u(\mu)] \quad \text{as } n \to \infty,\ r \to 0,\ \alpha \to 0 \tag{1.2.3}$$

and

$$P_{n,a_n\beta}\{Y_n \in B(\mu,\alpha)\} \approx \exp[-a_n I_\beta(\mu)] \quad \text{as } n \to \infty,\ \alpha \to 0. \tag{1.2.4}$$

In these formulas $I^u$ and $I_\beta$ are rate functions that map $\mathcal{X}$ into $[0,\infty]$ and are defined
in terms of the relative entropy $I$ appearing in (1.2.1). Because the Miller-Robert model
is defined in terms of a long-range interaction having a local mean-field scaling, in order
to obtain a nontrivial asymptotic theory $\beta$ must be scaled by $a_n$ in the definition of the
canonical ensemble $P_{n,\beta}$ H §3]. For the general formulation of (1.2.3) and (1.2.4) as large
deviation principles for a general class of models, the reader is referred to Theorem 3.2
and Theorem 2.4, respectively.

It is not difficult to motivate the forms of $I^u$ and $I_\beta$. In order to do so, we introduce two
basic thermodynamic functions, one associated with each ensemble. Since the groundbreaking
work of Lanford on equilibrium macrostates in classical statistical mechanics
[31], it has been recognized that the basic thermodynamic function associated with the
microcanonical ensemble is the microcanonical entropy $s$. In terms of the distribution
$P_n\{H_n \in \cdot\}$, this quantity measures the multiplicity of microstates $\zeta \in \mathcal{Y}^{a_n}$ consistent
with a given energy value $u$. It is defined by

$$s(u) = \lim_{r \to 0} \lim_{n \to \infty} \frac{1}{a_n} \log P_n\{H_n \in [u-r, u+r]\}. \tag{1.2.5}$$

For appropriate values of $u$, the limit exists and is given by (3.2), which is a variational
formula over macrostates $\mu$. For $\beta \in \mathbb{R}$ the basic thermodynamic function associated
with the canonical ensemble is the canonical free energy

$$\varphi(\beta) = -\lim_{n \to \infty} \frac{1}{a_n} \log Z(n, a_n\beta). \tag{1.2.6}$$

The limit exists and is given by (2.6), which is also a variational formula over macrostates.
We first motivate the form of $I_\beta$. If $Y_n \in B(\mu,\alpha)$, then for all sufficiently small $\alpha$ and
all sufficiently large $n$ (1.2.2) implies that

$$H_n(\zeta) \approx \tilde{H}(Y_n(\zeta)) \approx \tilde{H}(\mu).$$

Hence for all sufficiently small $\alpha$ and all sufficiently large $n$, the asymptotic formula (1.2.1)
and the definition of $\varphi$ yield

$$\begin{aligned}
P_{n,a_n\beta}\{Y_n \in B(\mu,\alpha)\} &= \frac{1}{Z(n,a_n\beta)} \int_{\{Y_n \in B(\mu,\alpha)\}} \exp[-a_n \beta H_n] \, dP_n \\
&\approx \frac{1}{Z(n,a_n\beta)} \exp[-a_n \beta \tilde{H}(\mu)] \, P_n\{Y_n \in B(\mu,\alpha)\} \\
&\approx \exp[-a_n (I(\mu) + \beta \tilde{H}(\mu) - \varphi(\beta))].
\end{aligned}$$

Comparing this with the desired asymptotic form (1.2.4) motivates the formula

$$I_\beta(\mu) = I(\mu) + \beta \tilde{H}(\mu) - \varphi(\beta). \tag{1.2.7}$$

The actual proof of the large deviation principle for the $P_{n,a_n\beta}$-distributions of $Y_n$ with
this rate function follows the sketch presented here and is not difficult. Related large
deviation principles have been obtained by numerous authors.

We now motivate the form of $I^u$. Suppose that $\tilde{H}(\mu) = u$. Then for all sufficiently
large $n$ depending on $r$ the set of $\zeta$ for which both $Y_n(\zeta) \in B(\mu,\alpha)$ and $H_n(\zeta) \in [u-r, u+r]$
is approximately equal to the set of $\zeta$ for which both $Y_n(\zeta) \in B(\mu,\alpha)$ and $\tilde{H}(Y_n(\zeta)) \in
[u-r, u+r]$. Since $\tilde{H}$ is continuous and $\tilde{H}(\mu) = u$, for all sufficiently small $\alpha$ compared
to $r$ this set reduces to $\{\zeta : Y_n(\zeta) \in B(\mu,\alpha)\}$. Hence for all sufficiently small $r$, all
sufficiently large $n$ depending on $r$, and all sufficiently small $\alpha$ compared to $r$, (1.2.1) and
the definition (1.2.5) of $s$ yield

$$\begin{aligned}
P_n^{u,r}\{Y_n \in B(\mu,\alpha)\} &= \frac{P_n\{\{Y_n \in B(\mu,\alpha)\} \cap \{H_n \in [u-r, u+r]\}\}}{P_n\{H_n \in [u-r, u+r]\}} \\
&\approx \frac{P_n\{Y_n \in B(\mu,\alpha)\}}{P_n\{H_n \in [u-r, u+r]\}} \\
&\approx \exp[-a_n (I(\mu) + s(u))].
\end{aligned}$$

On the other hand, if $\tilde{H}(\mu) \neq u$, then a similar calculation shows that for all sufficiently
small $r$, all sufficiently small $\alpha$, and all sufficiently large $n$, $P_n^{u,r}\{Y_n \in B(\mu,\alpha)\} = 0$. Comparing
these approximate calculations with the desired asymptotic form (1.2.3) motivates
the formula

$$I^u(\mu) = \begin{cases} I(\mu) + s(u) & \text{if } \tilde{H}(\mu) = u, \\ \infty & \text{if } \tilde{H}(\mu) \neq u. \end{cases} \tag{1.2.8}$$

In Section 3 we offer two proofs of the large deviation principle for the $P_n^{u,r}$-distributions
of $Y_n$. Both are straightforward; the first follows fairly closely the heuristic sketch just
given. Forms of this large deviation principle are given, for example, in [O, 32, 33].
The asymptotic formulas (1.2.3) and (1.2.4) give rise to several interpretations of the
rate functions. Through the distributions $P_n^{u,r}\{Y_n \in \cdot\}$ and $P_{n,a_n\beta}\{Y_n \in \cdot\}$, $I^u$ and $I_\beta$
measure the multiplicity of microstates $\zeta \in \mathcal{Y}^{a_n}$ consistent with a given macrostate $\mu$.
Because of these asymptotic formulas, it also makes sense to say that for $\mathcal{I} = I^u$ or $\mathcal{I} = I_\beta$
a macrostate $\mu_1 \in \mathcal{X}$ is more predictable than a macrostate $\mu_2 \in \mathcal{X}$ if $\mathcal{I}(\mu_1) < \mathcal{I}(\mu_2)$. Since
$\mathcal{I}$ is nonnegative, the most predictable or most probable macrostates $\mu$ solve $\mathcal{I}(\mu) = 0$. It is
natural to call such $\mu$ equilibrium macrostates. Specifically, $\mu \in \mathcal{X}$ satisfying $I^u(\mu) = 0$ is
called a microcanonical equilibrium macrostate; $\mathcal{E}^u$ denotes the set of all such macrostates.
Analogously, a measure $\mu \in \mathcal{X}$ satisfying $I_\beta(\mu) = 0$ is called a canonical equilibrium
macrostate; $\mathcal{E}_\beta$ denotes the set of all such macrostates. In terms of equilibrium macrostates
$\mu$, one can analyze the formation of coherent structures by defining the mean vorticity
as an appropriate average of $\mu$ and comparing it, say by simulation, with the long-time
behavior of the vorticity $\omega(x,t) = v_{2,x_1}(x,t) - v_{1,x_2}(x,t)$ as given by the Euler equations
59, M, M, |52|.



1.3 Equivalence and Nonequivalence of Ensembles 

The microcanonical ensemble is physically fundamental, and the canonical ensemble can 
be heuristically derived from it by considering a small subsystem of a large reservoir 
|H . Aside from physical considerations concerning which ensemble is more appropriate 
in the construction of a statistical model, the more mathematically tractable canonical 
ensemble is often introduced as an approximation to the microcanonical ensemble, which 
is somewhat difficult to analyze. However, in order to justify this use of the canonical 
ensemble, one must address a basic issue. At the level of equilibrium macrostates, do 
the two ensembles give equivalent results? This involves answering the following two 
questions. 

1. For every $\beta$ and every $\mu$ in the set $\mathcal{E}_\beta$ of canonical equilibrium macrostates, does
there exist a value of $u$ such that $\mu$ lies in the set $\mathcal{E}^u$ of microcanonical equilibrium
macrostates?

2. Conversely, for every $u$ and every $\mu \in \mathcal{E}^u$ does there exist a value of $\beta$ such that
$\mu \in \mathcal{E}_\beta$?

Whether or not the answers are yes, a more refined issue is to determine the precise 
relationships between $\mathcal{E}^u$ and $\mathcal{E}_\beta$. For example, if the answers are both yes, then given $\beta$
in question 1 (resp., $u$ in question 2), how does one determine the corresponding value
of $u$ (resp., $\beta$)? It is with these issues, appropriately formulated in terms of a general
class of models having multiple conserved quantities, that Sections 4 and 5 of the present
paper are occupied. In those sections definitive and sharp results on the equivalence and
nonequivalence of ensembles are derived. 

As we will see, in general question 1 in the preceding paragraph has the answer yes;
namely, every $\mu \in \mathcal{E}_\beta$ lies in $\mathcal{E}^u$ for some value of $u$. As we illustrate by a number of
examples given in Section 1.4, question 2 can have the answer no; namely, it can be
the case that the set of microcanonical equilibrium macrostates is richer than the set
of canonical equilibrium macrostates. As we show in Theorem 4.4, this behavior has a
striking formulation in terms of the microcanonical entropy $s$, which is defined in (1.2.5).
If $s$ is not concave at a given value of $u$, then the ensembles are nonequivalent in the sense
that $\mathcal{E}^u$ is disjoint from the sets $\mathcal{E}_\beta$ for all values of $\beta$.

This general result has been anticipated in a number of works, including those discussed
in Section 4.2 of [41] and in [JZ7| , ^J. These works exhibit nonconcave entropy
curves for a number of physical models that include a gravitating system of fermions and
a system of circular vortex filaments in an ideal fluid confined to a three-dimensional
torus; see Fig. 34 in [41], Fig. 3 in and Fig. 2 in [29]. They also point out that certain
equilibrium macrostates corresponding to nonconcave portions of the entropy are only
realizable in the continuum limit of the microcanonical ensemble but not of the canonical 
ensemble. Other examples of nonconcave entropies are given in Section 1.4 of the present 
paper. 

The question as to whether the microcanonical and canonical ensembles give equivalent
results at the level of equilibrium macrostates is formulated as a problem in global
optimization. Let $u$ and $\beta$ be given. By definition, a macrostate $\mu$ belongs to $\mathcal{E}^u$ if
and only if $I^u(\mu) = 0$. This is the case if and only if $\mu$ solves the following constrained
minimization problem:

$$\text{minimize } I(\mu) \text{ over } \mu \in \mathcal{X} \text{ subject to the constraint } \tilde{H}(\mu) = u; \tag{1.3.1}$$

it is worth noting that since the relative entropy $I(\mu)$ equals the negative of the physical entropy,
this display defines a maximum entropy principle with the energy constraint $\tilde{H}(\mu) = u$.
By definition, a macrostate $\mu$ belongs to $\mathcal{E}_\beta$ if and only if $I_\beta(\mu) = 0$. This is the case if
and only if $\mu$ solves the following unconstrained minimization problem:

$$\text{minimize } (I(\mu) + \beta \tilde{H}(\mu)) \text{ over } \mu \in \mathcal{X}. \tag{1.3.2}$$

In the unconstrained problem $\beta$ is a Lagrange multiplier dual to the constraint $\tilde{H}(\mu) = u$
in (1.3.1). Under general conditions, solutions of the constrained minimization problem
(1.3.1) are extremal points of $(I + \beta \tilde{H})$ on $\mathcal{X}$. The question as to whether the
microcanonical and canonical ensembles give equivalent results is equivalent to answering
the following refined question related to this property. What are the relationships between
the sets of solutions of the constrained and unconstrained minimization problems (1.3.1)
and (1.3.2)?
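
The relationship between the two minimization problems can be explored on a small finite example. The following sketch is purely illustrative: the five macrostates, their energies, and their relative entropies are invented numbers, not quantities from any model in this paper, and the function names are hypothetical. It collects the solutions of the constrained problem for each energy value and the solutions of the unconstrained problem over a grid of values of beta, and then compares the two collections.

```python
# Hypothetical finite set of macrostates: name -> (energy H(mu), relative entropy I(mu)).
MACROSTATES = {
    "a": (0.0, 0.0),
    "b": (0.0, 0.5),
    "c": (1.0, 0.9),
    "d": (1.0, 1.2),
    "e": (2.0, 0.4),
}
TOL = 1e-12  # tolerance used to detect ties between minimizers

def constrained_minimizers(u):
    """Solutions of: minimize I(mu) subject to the constraint H(mu) = u."""
    feasible = {k: i for k, (h, i) in MACROSTATES.items() if h == u}
    best = min(feasible.values())
    return {k for k, i in feasible.items() if i <= best + TOL}

def unconstrained_minimizers(beta):
    """Solutions of: minimize I(mu) + beta * H(mu) over all macrostates."""
    values = {k: i + beta * h for k, (h, i) in MACROSTATES.items()}
    best = min(values.values())
    return {k for k, v in values.items() if v <= best + TOL}

# Scan a grid of inverse temperatures and all three energy values.
betas = [-5.0 + 0.01 * j for j in range(1001)]
canonical = set().union(*(unconstrained_minimizers(b) for b in betas))
microcanonical = set().union(*(constrained_minimizers(u) for u in (0.0, 1.0, 2.0)))
print(sorted(canonical), sorted(microcanonical))
```

In this toy example every solution of the unconstrained problem also solves the constrained problem at its own energy, while the macrostate "c", which solves the constrained problem at u = 1, is never selected by any beta. The toy entropy values (0 at u = 0, -0.9 at u = 1, -0.4 at u = 2) fail to be concave at u = 1, which is consistent with the behavior described in the text.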



We now describe our results on the equivalence and nonequivalence of ensembles by 
relating them to the behavior of the two basic thermodynamic functions, $s$ and $\varphi$. The
following discussion applies to the Miller-Robert model as well as to a class of other models
that have the Hamiltonian as a single conserved quantity. The discussion generalizes to
a wide class of other models having multiple conserved quantities. We first motivate a
formula relating $s$ and $\varphi$. To do this, we use the definition of $s$, which we summarize by
the formula

$$P_n\{H_n \in du\} \approx \exp[a_n s(u)] \, du.$$

We now calculate 

= - lim — log Z(n, a n /3) 

rwoo a n 
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lim — log / exp[-a n /3H n ] dP n 

n-*oo a n Jy an 

l r 

lim — log / exp[— a n (3u] P n {H n G du] 

1 f 

lim — log / exp[— a n {(3u — s{u))} du. 
n— oo a n J M 



According to the heuristic reasoning that underlies Laplace's method, the main
contribution to the integral comes from the largest term. This motivates the relationship

$\varphi(\beta) = \inf_{u \in \mathbb{R}} \{\beta u - s(u)\}$, (1.3.3)

which expresses $\varphi$ as the Legendre-Fenchel transform $s^*$ of $s$.
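
As a numerical illustration of (1.3.3), the following Python sketch evaluates the Legendre-Fenchel transform on a grid. The entropy $s(u) = -u^2/2$ on $[-1, 1]$, the grid sizes, and the function name `legendre_fenchel` are illustrative assumptions and are not taken from the models in this paper. For this choice the infimum is attained at $u = -\beta$, so that $\varphi(\beta) = -\beta^2/2$ for $|\beta| \leq 1$.

```python
import numpy as np

def legendre_fenchel(u, s, betas):
    """Grid approximation of phi(beta) = inf_u {beta*u - s(u)}."""
    # Row j holds beta_j*u - s(u) over the u-grid; minimize along each row.
    return np.min(betas[:, None] * u[None, :] - s[None, :], axis=1)

# Hypothetical concave, nonpositive entropy on [-1, 1] (an assumption for illustration).
u = np.linspace(-1.0, 1.0, 2001)
s = -0.5 * u**2
betas = np.linspace(-0.5, 0.5, 11)

phi = legendre_fenchel(u, s, betas)
# Largest deviation from the closed form -beta^2/2 (pure discretization error).
print(np.max(np.abs(phi + 0.5 * betas**2)))
```

Restricting the infimum to a finite grid can only overestimate $\varphi$, so the sketch is meaningful only when the grid is fine relative to the curvature of $s$.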

For the Miller-Robert model and other models of turbulence considered in this paper,
$s$ is nonpositive and upper semicontinuous on $\mathbb{R}$ [Prop. |3j](a)]. If it is the
case that $s$ is concave on $\mathbb{R}$, then (1.3.3) can be inverted to give $s$ in terms
of $\varphi$; namely, for all $u \in \mathbb{R}$

$s(u) = \inf_{\beta \in \mathbb{R}} \{\beta u - \varphi(\beta)\}$. (1.3.4)

Hence, when $s$ is concave on $\mathbb{R}$, each basic thermodynamic function can be
obtained from the other by a similar formula. It is natural to say that in this case the
microcanonical ensemble and the canonical ensemble are thermodynamically equivalent
[28], [33]. As we will see in Theorems 4.4 and 4.9, thermodynamic equivalence of ensembles
is mirrored by equivalence-of-ensemble relationships at the level of equilibrium
macrostates.

By virtue of its definition (1.2.6) or formula (1.3.3), $\varphi$ is a finite, concave,
continuous function on $\mathbb{R}$. In the case of classical systems such as those
considered by Lanford [31], a superadditivity argument based on the fact that the
underlying Hamiltonian has finite range shows that the analogue of $s$ is an upper
semicontinuous, concave function on $\mathbb{R}$. In general, however, because of the local
mean-field, long-range nature of the Hamiltonians in the Miller-Robert model and other
models of turbulence considered in this paper, the associated microcanonical entropies are
typically not concave on subsets of $\mathbb{R}$ corresponding to a range of negative
temperatures.

In order to see how concavity properties of $s$ determine relationships between the sets
of equilibrium macrostates, we define for $u \in \mathbb{R}$ the concave function

$s^{**}(u) = \inf_{\beta \in \mathbb{R}} \{\beta u - s^*(\beta)\} = \inf_{\beta \in \mathbb{R}} \{\beta u - \varphi(\beta)\}$. (1.3.5)

Because of (1.3.4), it is obvious that $s$ is concave on $\mathbb{R}$ if and only if $s$ and
$s^{**}$ coincide. Whenever $s(u) > -\infty$ and $s(u) = s^{**}(u)$, we shall say that $s$
is concave at $u$.

Now assume that $s$ is not concave on $\mathbb{R}$; i.e., there exists $u \in \mathbb{R}$
for which $-\infty < s(u) \neq s^{**}(u)$. In this case, one easily shows that $s^{**}$
equals the smallest upper semicontinuous, concave function majorizing $s$. In particular,
when $s$ is not concave on $\mathbb{R}$, it cannot be recovered from $\varphi$ via a
Legendre-Fenchel transform.
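
The definition of $s^{**}$ in (1.3.5) and the notion of concavity at a point can be explored numerically. The Python sketch below is only an illustration under stated assumptions: the entropy $s(u) = -(u^2 - 1)^2$ on $[-1.5, 1.5]$ is a hypothetical nonconcave example and is not one of the models of this paper, and both infima are replaced by minima over finite grids. For this $s$ the concave hull equals 0 on $[-1, 1]$, so $s \neq s^{**}$ on the open interval $(-1, 1)$, including the part near $u = 0.8$ where $s$ is locally concave.

```python
import numpy as np

def concave_hull(u, s, betas):
    """Grid approximation of s** obtained from two Legendre-Fenchel transforms."""
    # phi(beta) = min_u {beta*u - s(u)}, i.e. phi = s* on the grid.
    phi = np.min(betas[:, None] * u[None, :] - s[None, :], axis=1)
    # s**(u) = min_beta {beta*u - phi(beta)}.
    return np.min(betas[:, None] * u[None, :] - phi[:, None], axis=0)

# Hypothetical nonconcave entropy (illustrative only): maxima at u = -1 and u = 1.
u = np.linspace(-1.5, 1.5, 601)
s = -(u**2 - 1.0)**2
betas = np.linspace(-10.0, 10.0, 401)   # wide enough to cover the slopes of s here

s_hull = concave_hull(u, s, betas)
nonconcave = s_hull - s > 1e-3          # grid points where s(u) and s**(u) differ
print(u[nonconcave].min(), u[nonconcave].max())
```

The threshold `1e-3` is an arbitrary tolerance that absorbs discretization error; points very close to $u = \pm 1$, where the gap $s^{**} - s$ is tiny, are therefore not flagged.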

As we now explain, concavity and nonconcavity properties of the microcanonical entropy
$s$ have crucial implications for the equivalence and nonequivalence of ensembles at the
level of equilibrium macrostates. In terms of such properties of $s$, we now give
preliminary and incomplete statements of the relationships between the sets $\mathcal{E}^u$
and $\mathcal{E}_\beta$ of equilibrium macrostates for the two ensembles. The reader is
referred to Theorems 4.4, 4.6, and 4.8 for precise statements. For easy reference they are
summarized in Figure 1 in Section 4.

For a given value of $u$, there are three possible relationships that can occur between
$\mathcal{E}^u$ and $\mathcal{E}_\beta$. If there exists a value of $\beta$ such that
$\mathcal{E}^u = \mathcal{E}_\beta$, then the ensembles are said to be fully equivalent. If
instead of equality $\mathcal{E}^u$ is a proper subset of $\mathcal{E}_\beta$ for some
$\beta$, then the ensembles are said to be partially equivalent. It may also happen that
$\mathcal{E}^u \cap \mathcal{E}_\beta = \emptyset$ for all values of $\beta$. If this
occurs, then the microcanonical ensemble is said to be nonequivalent to any canonical
ensemble, or nonequivalence of ensembles is said to hold. It is convenient to group the
first two cases together. If for a given $u$ there exists $\beta$ such that either
$\mathcal{E}^u$ equals $\mathcal{E}_\beta$ or $\mathcal{E}^u$ is a proper subset of
$\mathcal{E}_\beta$, then the ensembles are said to be equivalent.

The relationships between $\mathcal{E}^u$ and $\mathcal{E}_\beta$ depend on concavity and
nonconcavity properties of $s$, expressed through the equality or nonequality of $s(u)$
and $s^{**}(u)$. These relationships are given next in items 1-3 together with references
to where the results are stated precisely. Criteria for equivalence of ensembles related
to item 2 have been obtained in various settings by a number of authors, including
[12], [L9|, [$2|. However, the results underlying items 1 and 3 are new.

1. Canonical is always microcanonical. For every $\beta$ and every
$\mu \in \mathcal{E}_\beta$, there exists $u$ such that $\mu \in \mathcal{E}^u$ [Theorem 4.6].

2. Equivalence. If $-\infty < s(u) = s^{**}(u)$, i.e., if $s$ is concave at $u$, then there
exists $\beta$ such that the ensembles are equivalent [Remark 4.2 and Theorem 4.4(a)].

3. Nonequivalence. If $-\infty < s(u) \neq s^{**}(u)$, i.e., if $s$ is not concave at $u$,
then the corresponding microcanonical ensemble is nonequivalent to any canonical ensemble
[Remark 4.2 and Theorem 4.4(b)].

Let $u$ be a point in $\mathbb{R}$ such that $s(u) > -\infty$. According to items 2 and 3,
the ensembles are equivalent if and only if $s$ is concave at $u$. Under another natural
hypothesis on $u$, one shows that $s$ is concave at $u$ if and only if there exists a
supporting line to the graph of $s$ at $(u, s(u))$ [Lem. fO](a)]; i.e., there exists
$\beta \in \mathbb{R}$ such that

$s(w) \leq s(u) + \beta(w - u)$ for all $w \in \mathbb{R}$.

In Theorem 4.8 we refine this necessary and sufficient condition for equivalence of
ensembles by showing that the ensembles are fully equivalent if and only if there exists a
supporting line to the graph of $s$ that touches the graph of $s$ only at $(u, s(u))$;
i.e., there exists $\beta \in \mathbb{R}$ such that

$s(w) < s(u) + \beta(w - u)$ for all $w \neq u$.

A sufficient condition that guarantees this property of $s$ is that $s(u) = s^{**}(u)$ and
$s^{**}$ is strictly concave in a neighborhood of $u$.
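
The supporting-line criterion can be checked directly on sampled values of $s$. The following Python sketch is a minimal illustration under stated assumptions: the function name `supporting_slopes` and the test entropy $s(u) = -(u^2 - 1)^2$ are hypothetical and not taken from this paper. For a grid point $u_k$ it returns the interval of slopes $\beta$ for which $s(w) \leq s(u_k) + \beta(w - u_k)$ holds at every grid point $w$; the interval is empty exactly when no supporting line exists on the grid.

```python
import numpy as np

def supporting_slopes(u, s, k):
    """Return (lo, hi) such that s[j] <= s[k] + beta*(u[j] - u[k]) for all j
    holds exactly for beta in [lo, hi]; lo > hi means no supporting line.
    Assumes the entries of u are distinct."""
    du, ds = u - u[k], s - s[k]
    right, left = du > 0, du < 0
    # Points to the right force beta >= ds/du; points to the left force beta <= ds/du.
    lo = np.max(ds[right] / du[right]) if right.any() else -np.inf
    hi = np.min(ds[left] / du[left]) if left.any() else np.inf
    return lo, hi

u = np.linspace(-1.5, 1.5, 601)
s = -(u**2 - 1.0)**2          # hypothetical nonconcave entropy, for illustration only

for target in (0.0, 1.2):
    k = int(np.argmin(np.abs(u - target)))
    lo, hi = supporting_slopes(u, s, k)
    print(f"u = {u[k]:.3f}: supporting line exists on the grid: {lo <= hi}")
```

A finite grid cannot reliably separate the strict inequality that characterizes full equivalence from the nonstrict one, since a supporting line that touches the continuum graph at a second point may miss every other grid point; the sketch is therefore limited to the existence question.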
The relationships given in items 1-3 refine the relationships between the thermodynamic
functions $\varphi$ and $s$. In fact, the thermodynamic equivalence of ensembles that holds
when $s = s^{**}$ on $\mathbb{R}$ is reflected in the equivalence of ensembles for a given
value of $u$ when $-\infty < s(u) = s^{**}(u)$ [item 2]. On the other hand, when
$-\infty < s(u) \neq s^{**}(u)$ for some value of $u$, the lack of symmetry between
$\varphi$ and $s$ as expressed by (1.3.3) and (1.3.5) is mirrored by a lack of symmetry
between the microcanonical and canonical ensembles at the level of equilibrium
macrostates. For each $\beta$, every canonical equilibrium macrostate in
$\mathcal{E}_\beta$ lies in $\mathcal{E}^u$ for some $u$ [item 1]. However, for any $u$ for
which $-\infty < s(u) \neq s^{**}(u)$ the corresponding microcanonical ensemble is
nonequivalent to any canonical ensemble [item 3].

We also prove a number of interesting results that follow easily from the main theorems.
For example, in Corollary 4.7 we show that if $\mathcal{E}_\beta$ consists of a unique
macrostate $\mu$, then $\mathcal{E}^u$ consists of the unique macrostate $\mu$ for a
corresponding value of $u$ ($u = H(\mu)$). The uniqueness of an equilibrium macrostate
corresponds to the absence of a phase transition.



1.4 Examples of Nonconcave Microcanonical Entropies 

The most striking of our results on equivalence and nonequivalence of ensembles is given
in item 3 near the end of the preceding subsection. If, for a given value of $u$,
$-\infty < s(u) \neq s^{**}(u)$, then $\mathcal{E}^u$ is disjoint from the sets
$\mathcal{E}_\beta$ for all values of $\beta$. We next point out a number of statistical
mechanical models having a nonconcave microcanonical entropy and thus exhibiting, for a
range of values of $u$, the nonequivalence of ensembles that is formulated in item 3.



1. Point vortex system. This is the first statistical mechanical model proposed in
the literature for studying the two-dimensional Euler equations. It is defined in
terms of a singular interaction function, which is a Green's function. The model
was introduced by Onsager f41fl ; was further developed in the 1970s, notably by
Joyce and Montgomery [25]; and continues to be the subject of important studies,
including || §, ^, P^j . Proposition 6.2 in J| isolates a class of flow domains
for which the microcanonical entropy in the point vortex model is not a concave
function of its argument. As pointed out in [[28], §6], the Monte Carlo study of a point
vortex system in a disk carried out in |4l| also displays a nonconcave microcanonical
entropy. Strictly speaking, the results on nonequivalence of ensembles given in the
present paper apply only to a point vortex model in which the singular interaction
function in the classical model has been regularized; see part (a) of Example 2.3.
Nevertheless, special arguments can be invoked to extend them to the classical model
with singular point vortices.

2. Two-dimensional turbulence. A natural generalization, and also regularization,
of the point vortex model is the Miller-Robert model. In an unpublished numerical
study, Turkington and Liang consider the Miller-Robert model in a disk with
constraints on the energy, the total circulation, and the angular momentum (or
impulse) and with a prior distribution on the vorticity that corresponds to vortex
patch dynamics; this problem is the simplest Miller-Robert analogue of the problem
studied in [18] in the point-vortex formulation. For fixed values of the total
circulation and the angular momentum, Turkington and Liang compute microcanonical
entropies as a function of energy using the algorithm developed in [^TJ. They find
that the microcanonical entropy-energy curve is concave on a certain interval and
nonconcave on a complementary interval. These computations produce equilibrium
macrostates that are vortices embedded in circular shear flows.

3. Quasi-geostrophic turbulence on a $\beta$-plane. The statistical equilibrium models
proposed in |^D[ are implemented in [|TJ] for barotropic, quasi-geostrophic flow in a
channel on the $\beta$-plane. Various prior distributions on the potential vorticity are
considered; these include a saturated model, in which the maximum and minimum
of the potential vorticity constrain the microstate, and a dilute model, in which
only the mean potential-vorticity magnitude is imposed. Even in the absence of
geophysical effects ($\beta = 0$), the dilute model exhibits a nonconcave entropy-energy
curve, as displayed in Figure 4 of [14]. The equilibrium macrostates corresponding
to values of the energy for which the entropy is nonconcave are shears that transition
to monopolar vortices and then to dipolar vortices as the energy increases. When
the dilute model is replaced by the corresponding saturated model, in which an
upper bound on the microscopic potential vorticity is enforced, the equilibrium
macrostates are modified, particularly at high energies. As is shown in Figure 16 of
[14], the nonconcavity of the entropy-energy curve persists at low energies; at high
enough energies, however, it becomes concave, unlike in the dilute case. At these
high energies the equilibrium macrostates are not dipolar vortices, but rather shear
flows.

4. Quasi-geostrophic turbulence over topography. A more complete study of
the concavity of the microcanonical entropy is carried out in JL£| for
equivalent-barotropic, quasi-geostrophic flow over bottom topography on an $f$-plane.
As in [[TJ] a channel geometry is imposed, but for simplicity only shear flows are
considered. Within this symmetry class, the topography is chosen to be sinusoidal, the
energy and circulation are used as global invariants, and the prior distribution is taken
to be a Gamma distribution with mean 0, variance 1, and nonzero skewness. As a
function of the energy and the circulation, the entropy is nonconcave in more than
half of its domain. These two-constraint results are described in detail in Section 6
of ra.



5. Two-layer quasi-geostrophic turbulence. The one-layer model studied in
is extended to a two-layer system in |TB[ , where it is used to describe the physically
important phenomenon of open-ocean convection. In Figures 2 and 12 in that paper,
the entropy-energy curve is seen to be nonconcave; the microcanonical equilibrium
macrostates corresponding to values of the energy in the nonconcave region are
asymmetric baroclinic monopoles.
1.5 Contents of This Paper

In Section 2 we introduce the class of statistical mechanical models that will be analyzed
in this paper. These models generalize the Miller-Robert model by incorporating a finite
sequence of interaction functions $H_{n,i}$ rather than just the Hamiltonian. In order to
carry out the large deviation analysis, we assume that there exists a hidden process $Y_n$
that takes values in a complete separable metric space $\mathcal{X}$ and has the following
two properties: (a) for each interaction function there exists a representation function
$H_i$ such that uniformly over microstates $|H_{n,i} - H_i \circ Y_n| \to 0$ as
$n \to \infty$; (b) with respect to the prior measure $P_n$ in the model, $Y_n$ satisfies
the large deviation principle on $\mathcal{X}$. In Section 2 we show that with respect to
the canonical ensemble $Y_n$ satisfies the large deviation principle, and we derive several
properties of the set of canonical equilibrium macrostates.

In Section 3 we consider the microcanonical ensemble, proving a large deviation princi- 
ple and studying properties of the set of microcanonical equilibrium macrostates. We also 
point out the constrained maximum entropy principles that characterize microcanonical 
equilibrium macrostates in certain models including the Miller-Robert model. 

Section 4 is devoted to the presentation of our complete results on the equivalence and
nonequivalence of the two ensembles. The results are proved in Theorems 4.4, 4.6, and
4.8 and are summarized in Figure 1.

In Section 5.1 we introduce mixed ensembles obtained by treating a subset of the 
dynamical invariants canonically and the complementary subset of dynamical invariants 
microcanonically. We then prove the large deviation principle for these ensembles. Sec- 
tion 5.2 presents complete equivalence and nonequivalence results for the pure canonical 
and mixed ensembles while Section 5.3 does the same for the mixed and the pure micro- 
canonical ensembles. The results in Sections 5.2 and 5.3 follow from those in Section 4 
with minimal changes in proof. They are summarized in Figures 2 and 3.

Acknowledgement. We thank Michael Kiessling for a number of useful conversations. 



2 Canonical Ensemble: LDP and Equilibrium Macrostates 

In this section we present a large deviation principle for the canonical ensemble in a wide
range of statistical mechanical models [Thm. 2.4(b)]. In terms of that principle, the set
of canonical equilibrium macrostates is defined and some of its properties derived [Thms.
2.4(c)-2.5]. After defining the class of models under consideration, we specify in Example
2.3 a number of specific models to which the theory applies.

The models that we consider are defined in terms of the following quantities. 

Hypotheses 2.1. 

• A sequence of probability spaces $(\Omega_n, \mathcal{F}_n, P_n)$ indexed by
$n \in \mathbb{N}$; the $\Omega_n$ are the configuration spaces for the statistical
mechanical models.

• A positive integer $\sigma$ and for each $n \in \mathbb{N}$ a sequence of interaction
functions $\{H_{n,i}, i = 1, \ldots, \sigma\}$, which are bounded measurable functions
mapping $\Omega_n$ into $\mathbb{R}$. We define $H_n = (H_{n,1}, \ldots, H_{n,\sigma})$,
which maps $\Omega_n$ into $\mathbb{R}^\sigma$.

• A sequence of positive scaling constants $a_n \to \infty$.

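Let $\langle \cdot, \cdot \rangle$ denote the Euclidean inner product on
$\mathbb{R}^\sigma$. We define for each $n \in \mathbb{N}$,
$\beta = (\beta_1, \ldots, \beta_\sigma) \in \mathbb{R}^\sigma$, and set
$B \in \mathcal{F}_n$ the partition function

$Z_n(\beta) = \int_{\Omega_n} \exp\left[-\sum_{i=1}^{\sigma} \beta_i H_{n,i}\right] dP_n = \int_{\Omega_n} \exp[-\langle \beta, H_n \rangle]\, dP_n$,

which is well defined and finite, and the probability measure

$P_{n,\beta}\{B\} = \frac{1}{Z_n(\beta)} \int_B \exp[-\langle \beta, H_n \rangle]\, dP_n$. (2.1)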

The measures $P_{n,\beta}$ are Gibbs states that define the canonical ensemble for the
given model. For $\beta \in \mathbb{R}^\sigma$, we also define

$\varphi(\beta) = -\lim_{n \to \infty} \frac{1}{a_n} \log Z_n(a_n \beta)$

if the limit exists and is nontrivial. In this formula $\beta$ is scaled with $a_n$, as is
usual in studying the continuum limit of models of turbulence 0, §3]. We refer to
$\varphi(\beta)$ as the canonical free energy. If $\sigma = 1$ and $H_{n,1}$ is the
Hamiltonian of the system, then $\beta = \beta_1$ is the inverse temperature.

The first application of the theory of large deviations in this paper is to express
$\varphi(\beta)$ as a variational formula. Let $\mathcal{X}$ be a Polish space (a complete
separable metric space), $Y_n$ random variables mapping $\Omega_n$ into $\mathcal{X}$,
$Q_n$ probability measures on $(\Omega_n, \mathcal{F}_n)$, and $I$ a rate function on
$\mathcal{X}$. Thus $I$ maps $\mathcal{X}$ into $[0, \infty]$ and for each
$M \in [0, \infty)$ the set $\{x \in \mathcal{X} : I(x) \leq M\}$ is compact (compact level
sets). For $A$ a subset of $\mathcal{X}$, we define $I(A) = \inf_{x \in A} I(x)$. We say
that with respect to $Q_n$ the sequence $Y_n$ satisfies the large deviation principle, or
LDP, on $\mathcal{X}$ with scaling constants $a_n$ and rate function $I$ if for any closed
subset $F$ of $\mathcal{X}$ the large deviation upper bound

$\limsup_{n \to \infty} \frac{1}{a_n} \log Q_n\{Y_n \in F\} \leq -I(F)$ (2.2)

is valid and for any open subset $G$ of $\mathcal{X}$ the large deviation lower bound

$\liminf_{n \to \infty} \frac{1}{a_n} \log Q_n\{Y_n \in G\} \geq -I(G)$ (2.3)

is valid. We say that with respect to $Q_n$ the sequence $Y_n$ satisfies the Laplace
principle on $\mathcal{X}$ with scaling constants $a_n$ and rate function $I$ if for all
bounded continuous functions $f$ mapping $\mathcal{X}$ into $\mathbb{R}$

$$
\lim_{n \to \infty} \frac{1}{a_n} \log \int_{\Omega_n} \exp[a_n f(Y_n)]\, dQ_n
= \lim_{n \to \infty} \frac{1}{a_n} \log \int_{\mathcal{X}} \exp[a_n f(x)]\, Q_n\{Y_n \in dx\}
= \sup_{x \in \mathcal{X}} \{f(x) - I(x)\}.
$$
As pointed out in Theorems 1.2.1 and 1.2.3 in []T5 |, $Y_n$ satisfies the LDP with scaling
constants $a_n$ and rate function $I$ if and only if $Y_n$ satisfies the Laplace principle
with scaling constants $a_n$ and rate function $I$. Evaluating the large deviation upper
bound (2.2) for $F = \mathcal{X}$ and the large deviation lower bound (2.3) for
$G = \mathcal{X}$ yields $I(\mathcal{X}) = 0$, and since $I$ is nonnegative and has compact
level sets, the set of $x \in \mathcal{X}$ for which $I(x) = 0$ is nonempty and compact. In
the sequel we shall usually omit the phrase "with scaling constants $a_n$" in the
statements of LDP's and Laplace principles.
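
The Laplace principle can be checked numerically in a very simple setting. The Python sketch below is only an illustration under stated assumptions: it takes $Y_n$ to be the fraction of heads in $n$ fair coin flips with $a_n = n$, a hypothetical toy model that is not one of the models of this paper, for which Cramér's theorem gives the rate function $I(x) = x \log(2x) + (1 - x)\log(2(1 - x))$ on $[0, 1]$. The test function `f` and the value $n = 2000$ are arbitrary choices.

```python
import numpy as np

n = 2000                                  # a_n = n in this toy model
k = np.arange(n + 1)
x = k / n                                 # possible values of Y_n

# log Q_n{Y_n = k/n} for n fair coin flips, computed from log-factorials.
logfact = np.concatenate(([0.0], np.cumsum(np.log(np.arange(1, n + 1)))))
log_q = logfact[n] - logfact[k] - logfact[n - k] - n * np.log(2.0)

def f(x):
    return -(x - 0.3)**2                  # hypothetical bounded continuous test function

# Left side: (1/n) log of the expectation of exp[n f(Y_n)], via a stable log-sum-exp.
a = n * f(x) + log_q
lhs = (a.max() + np.log(np.sum(np.exp(a - a.max())))) / n

# Right side: sup_x {f(x) - I(x)}, approximated on a grid in the open interval (0, 1).
grid = np.linspace(1e-6, 1.0 - 1e-6, 100001)
rate = grid * np.log(2 * grid) + (1 - grid) * np.log(2 * (1 - grid))
rhs = np.max(f(grid) - rate)

print(lhs, rhs)
```

The two printed numbers should agree up to an error that vanishes as $n$ grows, in line with the limit in the definition of the Laplace principle.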

A large deviation analysis of the general model is possible provided we can find, as
specified in Hypotheses 2.2, a hidden space, a hidden process, and a sequence of interaction
representation functions, and provided the hidden process satisfies the LDP on the hidden
space.

Hypotheses 2.2. 

• Hidden space. This is a Polish space $\mathcal{X}$.

• Hidden process. This is a sequence $Y_n$, where each $Y_n$ is a random variable
mapping $\Omega_n$ into $\mathcal{X}$.

• Interaction representation functions. This is a sequence
$\{H_i, i = 1, \ldots, \sigma\}$ of bounded continuous functions mapping $\mathcal{X}$ into
$\mathbb{R}$ such that as $n \to \infty$

$H_{n,i}(\omega) = H_i(Y_n(\omega)) + o(1)$ uniformly for $\omega \in \Omega_n$; (2.4)

i.e., $\lim_{n \to \infty} \sup_{\omega \in \Omega_n} |H_{n,i}(\omega) - H_i(Y_n(\omega))| = 0$.
We define $H = (H_1, \ldots, H_\sigma)$, which maps $\mathcal{X}$ into $\mathbb{R}^\sigma$.

• LDP for the hidden process. There exists a rate function $I$ mapping $\mathcal{X}$ into
$[0, \infty]$ such that with respect to $P_n$ the sequence $Y_n$ satisfies the LDP on
$\mathcal{X}$, or equivalently the Laplace principle on $\mathcal{X}$, with rate function $I$.

In this context we use the term "hidden" because in many cases the choices of the space
$\mathcal{X}$ and the process $Y_n$ are far from obvious. ■

We next present several models of turbulence to which the results of this paper can 
be applied. 



Example 2.3. (a) Regularized Point Vortex Model. This model, analyzed in |1S
is an approximation to the point vortex model, which we first define. Let $\Lambda$ be a
smooth, bounded, connected, open subset of $\mathbb{R}^2$; $g(x, x')$ the Green's function
for $-\Delta$ on $\Lambda$ with Dirichlet boundary conditions; $h$ the continuous function
mapping $\Lambda$ into $\mathbb{R}$ defined by $h(x) = \frac{1}{2}\tilde{g}(x, x)$, where
$\tilde{g}(x, x')$ is the regular part of the Green's function $g(x, x')$; and $\theta$
normalized Lebesgue measure on $\Lambda$ satisfying $\theta(\Lambda) = 1$. For
$n \in \mathbb{N}$ the point vortex model is defined on the configuration spaces
$\Omega_n = \Lambda^n$ with the Borel $\sigma$-field. $P_n$ equals the product measure on
$\Omega_n$ with identical one-dimensional marginals $\theta$, and $a_n = n$.
Configurations $\zeta \in \Lambda^n$ give the locations $\zeta_1, \ldots, \zeta_n$ of the
$n$ vortices. The interaction function for the point vortex model is the Hamiltonian

$H_n(\zeta) = \frac{1}{n^2} \sum_{1 \leq i < j \leq n} g(\zeta_i, \zeta_j) + \frac{1}{n^2} \sum_{1 \leq i \leq n} h(\zeta_i)$. (2.5)

Because $g(x, x')$ and $h(x)$ are not bounded continuous functions of $x$ and $x'$ in
$\Lambda$, the point vortex model cannot be studied by the methods of this paper, but must
be analyzed by other techniques [8], [9], [28]. The regularized point vortex model is
defined like the point vortex model except that in the formula for $H_n$, $g(x, x')$ is
replaced by a suitable bounded continuous function $V(x, x')$ on $\Lambda^2$ and $h$ is
replaced by a suitable bounded continuous $k$ on $\Lambda$.

For the regularized point vortex model the hidden space is the space $\mathcal{X}$ of
probability measures on $\Lambda$ while the hidden process is the sequence of empirical
measures

$Y_n = Y_n(\zeta, dx) = \frac{1}{n} \sum_{i=1}^{n} \delta_{\zeta_i}(dx)$.

By Sanov's Theorem, this sequence satisfies the large deviation principle on $\mathcal{X}$
with rate function the relative entropy $R(\mu | \theta)$ of $\mu$ with respect to $\theta$
|L0|, |y], [15]. For $\mu \in \mathcal{X}$ the interaction representation function is
defined by

$H(\mu) = \frac{1}{2} \int_{\Lambda \times \Lambda} V(x, x')\, \mu(dx)\, \mu(dx')$.

The approximation property (2.4) is easily verified.
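
To make the hidden process and the Sanov rate function concrete, the following Python sketch computes an empirical measure and its relative entropy on a finite state space. It is a simplified illustration under stated assumptions: the flow domain is replaced by a hypothetical partition into four cells of equal $\theta$-measure, and the function names `empirical_measure` and `relative_entropy` are not from this paper.

```python
import numpy as np

def empirical_measure(samples, num_states):
    """Empirical measure (1/n) * sum_i delta_{samples[i]} on states 0..num_states-1."""
    return np.bincount(samples, minlength=num_states) / len(samples)

def relative_entropy(mu, theta):
    """R(mu|theta) = sum_x mu(x) log(mu(x)/theta(x)), with 0 log 0 = 0.
    Returns +inf if mu is not absolutely continuous with respect to theta."""
    mu, theta = np.asarray(mu, float), np.asarray(theta, float)
    support = mu > 0
    if np.any(theta[support] == 0):
        return np.inf
    return float(np.sum(mu[support] * np.log(mu[support] / theta[support])))

theta = np.full(4, 0.25)                     # hypothetical 4-cell discretization
rng = np.random.default_rng(0)
samples = rng.integers(0, 4, size=10_000)    # cell indices of n vortices drawn from theta
Y_n = empirical_measure(samples, 4)
print(relative_entropy(Y_n, theta))          # close to 0 because Y_n is close to theta
```

The relative entropy vanishes only at $\mu = \theta$, which is consistent with the law of large numbers for the empirical measures under the prior $P_n$.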

(b) Miller-Robert Model. This model of the two-dimensional Euler equations is
analyzed in ||, which explains in detail the physical background. For simplicity, let the
flow domain be $T^2$, the unit torus $[0, 1) \times [0, 1)$ with periodic boundary
conditions. For each $n \in \mathbb{N}$ let $\mathcal{L}_n$ be a uniform lattice of
$a_n = 2^{2n}$ sites $s$ in $T^2$. The intersite spacing in each coordinate direction is
$2^{-n}$. Each such lattice of $a_n$ sites induces a dyadic partition of $T^2$ into $a_n$
squares called microcells, each having area $1/a_n$. For each $s \in \mathcal{L}_n$ we
denote by $M(s)$ the unique microcell containing the site $s$ in its lower left corner. The
configuration spaces for the Miller-Robert model are $\Omega_n = \mathcal{Y}^{a_n}$, where
$\mathcal{Y}$ is a given compact subset of $\mathbb{R}$. Microstates are denoted by
$\zeta = \{\zeta(s), s \in \mathcal{L}_n\}$. Let $\rho$ be a probability measure on
$\mathbb{R}$ with support $\mathcal{Y}$. $P_n$ equals the product measure on $\Omega_n$
with identical one-dimensional marginals $\rho$.

There are two classes of interaction functions, the Hamiltonian and the generalized
enstrophies. For $\zeta \in \Omega_n$ the Hamiltonian is defined by

$H_{n,1}(\zeta) = \frac{1}{2a_n^2} \sum_{s, s' \in \mathcal{L}_n} g_n(s - s')\, \zeta(s)\zeta(s')$,

where $g_n(s - s')$ is a certain bounded continuous approximation to the lattice Green's
function

$g(s - s') = \sum_{0 \neq z \in \mathbb{Z}^2} |2\pi z|^{-2} \exp[2\pi i \langle z, s - s' \rangle]$.
Fix $\sigma \in \mathbb{N}$. For $i = 2, \ldots, \sigma + 1$ the generalized enstrophies are
defined by

$H_{n,i}(\zeta) = \frac{1}{a_n} \sum_{s \in \mathcal{L}_n} a_i(\zeta(s))$,

where the $a_i$ are continuous functions mapping $\mathcal{Y}$ into $\mathbb{R}$.



Hypotheses 2.2 are verified in ||, to which the reader is referred for details. Let
$\theta$ denote Lebesgue measure on $T^2$. The hidden space is the space $\mathcal{X}$ of
probability measures $\mu(dx \times dy)$ on $T^2 \times \mathcal{Y}$ with first marginal
$\theta$. The hidden process is the sequence of measures

$Y_n(dx \times dy) = Y_n(\zeta, dx \times dy) = \theta(dx) \otimes \sum_{s \in \mathcal{L}_n} 1_{M(s)}(x)\, \delta_{\zeta(s)}(dy)$.

For $\mu \in \mathcal{X}$ the Hamiltonian interaction function is given by

$H_1(\mu) = \frac{1}{2} \int_{(T^2 \times \mathcal{Y})^2} g(x - x')\, y y'\, \mu(dx \times dy)\, \mu(dx' \times dy')$

while for $i = 2, \ldots, \sigma + 1$ the interaction functions for the generalized
enstrophies are given by

$H_i(\mu) = \int_{T^2 \times \mathcal{Y}} a_i(y)\, \mu(dx \times dy)$.

For $i = 1$ one verifies (2.4) by a detailed Fourier analysis. For
$i = 2, \ldots, \sigma + 1$, (2.4) is easily verified to hold with no error term.

Given $n \in \mathbb{N}$ and an even integer $q < 2n$, we consider a dyadic partition of
the lattice $\mathcal{L}_n$ into $2^q$ blocks, each block containing $a_n/2^q$ lattice
sites. In correspondence with this partition we have a dyadic partition
$\{D_{q,k}, k = 1, \ldots, 2^q\}$ of $T^2$ into macrocells. Each macrocell is the union of
$a_n/2^q$ microcells $M(s)$. The large deviation principle for $Y_n$ with respect to $P_n$
is verified by comparing $Y_n$ with the two-component process

$W_{n,q}(dx \times dy) = W_{n,q}(\zeta, dx \times dy) = \theta(dx) \otimes \sum_{k=1}^{2^q} 1_{D_{q,k}}(x)\, L_{n,q,k}(\zeta, dy)$,

where $L_{n,q,k}$ denotes the empirical measure
$\frac{1}{a_n/2^q} \sum_{s \in D_{q,k}} \delta_{\zeta(s)}(dy)$. Through these empirical
measures, $W_{n,q}$ introduces an averaging over the intermediate scale of the macrocells
and thus corresponds to a coarse graining of the vorticity field. Using Sanov's Theorem,
one verifies that as $n \to \infty$, $q \to \infty$, $W_{n,q}$ satisfies the two-parameter
LDP on $\mathcal{X}$ with rate function the relative entropy
$R(\mu | \theta \times \rho)$ of $\mu(dx \times dy)$ with respect to the product measure
$\theta(dx) \times \rho(dy)$ || §5]. An approximation result relating $Y_n$ and $W_{n,q}$
then allows one to prove that $Y_n$ satisfies the LDP on $\mathcal{X}$ with the same rate
function.

(c) Quasi-geostrophic potential vorticity model. This model of the quasi-geostrophic
potential vorticity equation, described in detail in [H and jTjJ, incorporates
the geophysical terms associated with the Coriolis effect, the deformation of an upper free
surface, and bottom topography. The large deviation analysis of the model is carried out
in Ul6 .

(d) Dispersive wave model for the nonlinear Schrödinger equation. This
model is defined in [23], [24], to which the reader is referred for details. The hidden
process is a Gaussian process taking values in $L^2[0, 1]$ and satisfying the LDP with
respect to the prior distribution that is proved in |T3. The large deviation analysis of
this model is the subject of [17].



We now return to the general model. Its large deviation analysis with respect to
the canonical ensemble is summarized in the next theorem. Part (a) states a variational
formula for $\varphi(\beta)$, and part (b) gives the LDP for the hidden process $Y_n$ with
respect to the sequence of Gibbs measures $P_{n, a_n\beta}$. Part (c) describes the set
$\mathcal{E}_\beta$ consisting of points at which the rate function in part (b) attains its
minimum of 0. Part (d) gives a concentration property of $\mathcal{E}_\beta$. As we point
out after the statement of the theorem, $\mathcal{E}_\beta$ can be identified with the set
of equilibrium macrostates of the statistical mechanical model. The mathematical
tractability of the canonical ensemble is reflected in the simplicity of the proof of
Theorem 2.4.



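Theorem 2.4. We assume Hypotheses 2.1 and 2.2. For $\beta \in \mathbb{R}^\sigma$ the
following conclusions hold.

(a) $\varphi(\beta) = -\lim_{n \to \infty} \frac{1}{a_n} \log Z_n(a_n\beta)$ exists and is
given by

$\varphi(\beta) = \inf_{x \in \mathcal{X}} \{\langle \beta, H(x) \rangle + I(x)\}$. (2.6)

$\varphi(\beta)$ is a finite, concave, continuous function on $\mathbb{R}^\sigma$.

(b) With respect to $P_{n, a_n\beta}$, $Y_n$ satisfies the LDP on $\mathcal{X}$ with rate
function

$I_\beta(x) = I(x) + \langle \beta, H(x) \rangle - \inf_{y \in \mathcal{X}} \{I(y) + \langle \beta, H(y) \rangle\} = I(x) + \langle \beta, H(x) \rangle - \varphi(\beta)$.

(c) The set $\mathcal{E}_\beta = \{x \in \mathcal{X} : I_\beta(x) = 0\}$ is a nonempty,
compact subset of $\mathcal{X}$. A point $x$ lies in $\mathcal{E}_\beta$ if and only if

$I(x) + \langle \beta, H(x) \rangle = \inf_{y \in \mathcal{X}} \{I(y) + \langle \beta, H(y) \rangle\} = \varphi(\beta)$;

equivalently, if and only if $x$ solves the following unconstrained minimization problem:

minimize $I(x) + \langle \beta, H(x) \rangle$ over $x \in \mathcal{X}$.

(d) If $A$ is any Borel subset of $\mathcal{X}$ whose closure $\bar{A}$ satisfies
$\bar{A} \cap \mathcal{E}_\beta = \emptyset$, then $I_\beta(\bar{A}) > 0$ and for some
$C < \infty$

$P_{n, a_n\beta}\{Y_n \in A\} \leq C \exp[-a_n I_\beta(\bar{A})/2] \to 0$ as $n \to \infty$.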

Proof. (a) Since $Y_n$ satisfies the LDP with respect to $P_n$, $Y_n$ satisfies the Laplace
principle with respect to $P_n$ with the same rate function $I$. Hence by the approximation
property (2.4) and the boundedness and continuity of the function mapping
$x \mapsto \langle \beta, H(x) \rangle$,

$$
\begin{aligned}
\varphi(\beta) &= -\lim_{n \to \infty} \frac{1}{a_n} \log Z_n(a_n\beta) \\
&= -\lim_{n \to \infty} \frac{1}{a_n} \log \int_{\Omega_n} \exp[-a_n \langle \beta, H_n \rangle]\, dP_n \\
&= -\lim_{n \to \infty} \frac{1}{a_n} \log \int_{\Omega_n} \exp[-a_n \langle \beta, H(Y_n) \rangle]\, dP_n \\
&= \inf_{x \in \mathcal{X}} \{\langle \beta, H(x) \rangle + I(x)\}.
\end{aligned}
$$

This formula exhibits $\varphi$ as a finite, concave function on $\mathbb{R}^\sigma$, which
is therefore continuous on $\mathbb{R}^\sigma$.

(b) $I_\beta$ is a rate function since $I$ is a rate function and the function mapping
$x \mapsto \langle \beta, H(x) \rangle$ is bounded and continuous. In order to prove that
with respect to $P_{n, a_n\beta}$ $Y_n$ satisfies the LDP with rate function $I_\beta$, it
suffices to prove that with respect to $P_{n, a_n\beta}$ $Y_n$ satisfies the Laplace
principle with rate function $I_\beta$. This is an immediate consequence of (2.4) and part
(a); for details, see the proof of part (b) of Theorem 3.1 in ||.

(c) $\mathcal{E}_\beta$ is a nonempty, compact subset of $\mathcal{X}$ because $I_\beta$ is
a rate function. The equivalent characterizations of $x \in \mathcal{E}_\beta$ follow from
the definition of $I_\beta$.

(d) If $\bar{A} \cap \mathcal{E}_\beta = \emptyset$, then for each $x \in \bar{A}$ we have
$I_\beta(x) > 0$. Since $I_\beta$ is a rate function, it follows that
$I_\beta(\bar{A}) > 0$. The large deviation upper bound in part (b) yields the display in
part (d) for some $C < \infty$. The proof of the theorem is complete. ■

Part (d) of Theorem 2.4 can be regarded as a concentration property of the
$P_{n, a_n\beta}$-distributions of $Y_n$. This property justifies calling
$\mathcal{E}_\beta$ the set of equilibrium macrostates with respect to
$P_{n, a_n\beta}\{Y_n \in dx\}$ or, for short, the set of canonical equilibrium macrostates.
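
The characterization of canonical equilibrium macrostates as minimizers of $I + \langle \beta, H \rangle$ can be illustrated on a toy example. The Python sketch below uses a mean-field spin model of Curie-Weiss type, which is a hypothetical illustration and not one of the models treated in this paper: the macrostate is reduced to a magnetization $m \in [-1, 1]$, $I(m)$ is the relative entropy of the two-point distribution $((1+m)/2, (1-m)/2)$ with respect to $(1/2, 1/2)$, and the representation function is assumed to be $H(m) = -m^2/2$. The function name `canonical_minimizers` and the grid parameters are likewise assumptions.

```python
import numpy as np

def canonical_minimizers(beta, num=2001, tol=1e-12):
    """Grid minimizers of I(m) + beta*H(m) for a hypothetical mean-field toy model."""
    m = np.linspace(-1.0, 1.0, num)
    p, q = (1 + m) / 2, (1 - m) / 2
    # I(m): relative entropy of (p, q) with respect to (1/2, 1/2), with 0 log 0 = 0.
    with np.errstate(divide="ignore", invalid="ignore"):
        I = (np.where(p > 0, p * np.log(2 * p), 0.0)
             + np.where(q > 0, q * np.log(2 * q), 0.0))
    H = -0.5 * m**2                      # assumed representation function
    objective = I + beta * H
    return m[objective <= objective.min() + tol]

print(canonical_minimizers(0.5))   # a single minimizer at m = 0
print(canonical_minimizers(2.0))   # grid points nearest the two nonzero solutions of m = tanh(2m)
```

In this toy model the set of minimizers is a single point for $\beta \leq 1$ and consists of two symmetric points for $\beta > 1$, which illustrates how non-uniqueness of the equilibrium macrostate signals a phase transition.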

The next theorem further justifies the designation of $\mathcal{E}_\beta$ as the set of
canonical equilibrium macrostates by relating weak limits of subsequences of
$P_{n, a_n\beta}\{Y_n \in \cdot\}$ to $\mathcal{E}_\beta$. For example, if one knows that
$\mathcal{E}_\beta$ consists of a unique point $x$, then it follows that the entire
sequence $P_{n, a_n\beta}\{Y_n \in \cdot\}$ converges weakly to $\delta_x$. This situation
corresponds to the absence of a phase transition. For specific models, more detailed
information about weak limits of subsequences of $P_{n, a_n\beta}$ has been obtained by a
number of authors including



Theorem 2.5. We assume Hypotheses 2.1 and 2.2. For $\beta \in \mathbb{R}^\sigma$, any
subsequence of $P_{n, a_n\beta}\{Y_n \in \cdot\}$ has a subsubsequence converging weakly to
a probability measure $\Pi_\beta$ on $\mathcal{X}$ that is concentrated on
$\mathcal{E}_\beta = \{x \in \mathcal{X} : I_\beta(x) = 0\}$; i.e.,
$\Pi_\beta\{(\mathcal{E}_\beta)^c\} = 0$. If $\mathcal{E}_\beta$ consists of a unique point
$x$, then the entire sequence $P_{n, a_n\beta}\{Y_n \in \cdot\}$ converges weakly to
$\delta_x$.



Proof. Define $a^* = \min_{n \in \mathbb{N}} a_n > 0$. As shown in the proof of Lemma 2.6
in p3) , the large deviation upper bound given in part (b) of Theorem 2.4 implies that for
each $M \in (0, \infty)$ there exists a compact subset $K$ of $\mathcal{X}$ such that for
all $n \in \mathbb{N}$

$P_{n, a_n\beta}\{Y_n \in K^c\} \leq \frac{e^{-a_n M}}{1 - e^{-a_n M}} \leq \frac{e^{-a^* M}}{1 - e^{-a^* M}}$.
It follows that the sequence $P_{n, a_n\beta}\{Y_n \in \cdot\}$ is tight and therefore that
any subsequence has a subsubsequence $P_{n', a_{n'}\beta}\{Y_{n'} \in \cdot\}$ converging
weakly as $n' \to \infty$ to a probability measure $\Pi_\beta$ on $\mathcal{X}$ [Prohorov's
Theorem]. In order to show that $\Pi_\beta$ is concentrated on $\mathcal{E}_\beta$, we
write the open set $(\mathcal{E}_\beta)^c$ as a union of countably many open balls $V_j$
such that the closure $\bar{V}_j$ of each $V_j$ has empty intersection with
$\mathcal{E}_\beta$. By part (d) of Theorem 2.4,
$P_{n', a_{n'}\beta}\{Y_{n'} \in V_j\} \to 0$ as $n' \to \infty$, and so

$0 = \liminf_{n' \to \infty} P_{n', a_{n'}\beta}\{Y_{n'} \in V_j\} \geq \Pi_\beta\{V_j\}$.

It follows that $\Pi_\beta\{V_j\} = 0$ and thus that
$\Pi_\beta\{(\mathcal{E}_\beta)^c\} = 0$, as claimed.

Now assume that $\mathcal{E}_\beta = \{x\}$. Then the only probability measure on
$\mathcal{X}$ that is concentrated on $\mathcal{E}_\beta$ is $\delta_x$. Since by the first
part of the proof any subsequence of $P_{n, a_n\beta}\{Y_n \in \cdot\}$ has a
subsubsequence converging weakly to $\delta_x$, it follows that the entire sequence
$P_{n, a_n\beta}\{Y_n \in \cdot\}$ converges weakly to $\delta_x$. This completes the
proof. ■

In the next section we consider the LDP for Y n when conditioning is present. 



3 Microcanonical Ensemble: LDP and Equilibrium Macrostates

As in the preceding section, we consider models defined in terms of a sequence of
interaction functions $\{H_{n,i}, i = 1, \ldots, \sigma\}$, which are bounded measurable
functions mapping $\Omega_n$ into $\mathbb{R}$. In general, the interaction functions
represent conserved quantities with respect to some dynamics that underlies the model. For
suitable values of $(u_1, \ldots, u_\sigma) \in \mathbb{R}^\sigma$ the ideal way to define
the microcanonical ensemble is to condition the probability measure $P_n$ on the set
$\{H_{n,1} = u_1, \ldots, H_{n,\sigma} = u_\sigma\}$. However, in order to avoid problems
concerning the existence of regular conditional probability distributions, we shall
condition $P_n$ on
$\{H_{n,1} \in [u_1 - r, u_1 + r], \ldots, H_{n,\sigma} \in [u_\sigma - r, u_\sigma + r]\}$,
where $r \in (0, 1)$. These conditioned measures, given in (3.4), define the microcanonical
ensemble. Theorem 3.2 proves the LDP for the distributions of $Y_n$ with respect to the
microcanonical ensemble in the double limit obtained by sending first $n \to \infty$ and
then $r \to 0$. We then define, in terms of the rate function in this LDP, the set of
microcanonical equilibrium macrostates and derive some of its properties.

For $u = (u_1, \ldots, u_\sigma) \in \mathbb{R}^\sigma$ a key role in the large deviation analysis of the microcanonical
ensemble is played by

$$J(u) = \inf\{I(x) : x \in \mathcal{X},\ H(x) = u\}. \tag{3.1}$$

In terms of $J$ the canonical free energy $\varphi(\beta)$, given in part (a) of Theorem 2.4 by

$$\varphi(\beta) = \inf_{x \in \mathcal{X}} \{\langle \beta, H(x) \rangle + I(x)\},$$

can be rewritten as

$$\varphi(\beta) = \inf_{u \in \mathbb{R}^\sigma} \Big\{ \inf\{\langle \beta, H(x) \rangle + I(x) : x \in \mathcal{X},\ H(x) = u\} \Big\} = \inf_{u \in \mathbb{R}^\sigma} \{\langle \beta, u \rangle + J(u)\}.$$

Introducing the microcanonical entropy

$$s(u) = -J(u) = -\inf\{I(x) : x \in \mathcal{X},\ H(x) = u\}, \tag{3.2}$$

we have

$$\varphi(\beta) = \inf_{u \in \mathbb{R}^\sigma} \{\langle \beta, u \rangle - s(u)\}. \tag{3.3}$$

This formula expresses $\varphi$ as the Legendre-Fenchel transform of $s$. The microcanonical
entropy will play a central role in the results on equivalence and nonequivalence of the
canonical and microcanonical ensembles to be presented in Section 4.
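As a quick sanity check of formula (3.3), the following worked computation uses a purely hypothetical entropy that is not taken from any of the models treated in the paper: $\sigma = 1$ and $s(u) = -u^2/2$, which is nonpositive and concave. The only point is to show how the infimum in (3.3) is evaluated.

```latex
% Hypothetical example (assumption): \sigma = 1 and s(u) = -u^2/2.
\begin{align*}
\varphi(\beta)
  &= \inf_{u \in \mathbb{R}} \{ \beta u - s(u) \}
   = \inf_{u \in \mathbb{R}} \Big\{ \beta u + \tfrac{1}{2} u^2 \Big\} \\
  &= \inf_{u \in \mathbb{R}} \Big\{ \tfrac{1}{2} (u + \beta)^2 - \tfrac{1}{2} \beta^2 \Big\}
   = -\tfrac{1}{2} \beta^2 ,
\end{align*}
% with the infimum attained at u = -\beta.
```

In this toy case $\varphi$ is finite, concave, and continuous, and applying the same transform to $\varphi$ returns $s$, as one expects for a concave, upper semicontinuous entropy.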

The function $J$ plays other roles in the theory. Since each $H_i$ is a bounded continuous
function mapping $\mathcal{X}$ into $\mathbb{R}$ and since with respect to $P_n$, $Y_n$ satisfies the LDP on $\mathcal{X}$
with rate function $I$, it follows from the contraction principle that with respect to $P_n$,
$H(Y_n) = (H_1(Y_n), \ldots, H_\sigma(Y_n))$ satisfies the LDP on $\mathbb{R}^\sigma$ with rate function $J$ [10, Thm.
4.2.1]. When expressed in terms of the equivalent Laplace principle, this means that for
any bounded continuous function $g$ mapping $\mathbb{R}^\sigma$ into $\mathbb{R}$

$$\lim_{n \to \infty} \frac{1}{a_n} \log \int_{\Omega_n} \exp[a_n\, g(H(Y_n))] \, dP_n = \sup_{u \in \mathbb{R}^\sigma} \{g(u) - J(u)\}.$$

Because of the approximation property (2.4), this readily extends to the Laplace principle
on $\mathbb{R}^\sigma$, and thus the LDP on $\mathbb{R}^\sigma$, for $H_n = (H_{n,1}, \ldots, H_{n,\sigma})$.

In part (a) of the next proposition we record the LDPs just discussed and two properties
of the microcanonical entropy. When applied to the regularized point vortex model,
the LDP for the $P_n$-distributions of $H_n$ generalizes the large deviation estimates obtained
in [ITS], Thm. 2.1]. In parts (b) and (c) of the proposition some related facts needed later
in this section are given. We define $\operatorname{dom} J$ to be the set of $u \in \mathbb{R}^\sigma$ for which $J(u) < \infty$.
For $r \in (0, 1)$ and $u \in \operatorname{dom} J$, we also define

$$\{u\}^{(r)} = [u_1 - r, u_1 + r] \times \cdots \times [u_\sigma - r, u_\sigma + r].$$

Part (b) is a consequence of the LDP for $H_n$ given in part (a) and of the bound $J(\operatorname{int}(\{u\}^{(r)})) \le J(u)$.
Part (c) follows from the lower semicontinuity of $J$ and from part (b).



Proposition 3.1. We assume Hypotheses 2.1 and 2.2. The following conclusions hold.

(a) With respect to $P_n$, the sequences $H(Y_n)$ and $H_n$ satisfy the LDP on $\mathbb{R}^\sigma$ with rate
function $J$. Hence $s = -J$ is nonpositive and upper semicontinuous.

(b) For $u \in \operatorname{dom} J$ and any $r \in (0, 1)$

$$-J(u) \le \liminf_{n \to \infty} \frac{1}{a_n} \log P_n\{H_n \in \{u\}^{(r)}\} \le \limsup_{n \to \infty} \frac{1}{a_n} \log P_n\{H_n \in \{u\}^{(r)}\} \le -J(\{u\}^{(r)}).$$

(c) As $r \to 0$, $J(\{u\}^{(r)}) \nearrow J(u)$. Hence

$$\lim_{r \to 0} \lim_{n \to \infty} \frac{1}{a_n} \log P_n\{H_n \in \{u\}^{(r)}\} = -J(u).$$
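Parts (b) and (c) can be checked numerically in a setting much simpler than the models of the paper. The sketch below assumes a hypothetical example: $H_n$ is the sample mean of $n$ fair coin flips, $a_n = n$, and $J$ is the relative entropy with respect to the Bernoulli(1/2) distribution (the classical Cramér rate function for this case). All function names are illustrative. It computes $\frac{1}{n} \log P_n\{H_n \in [u - r, u + r]\}$ exactly and compares it with $-J(\{u\}^{(r)})$.

```python
import math

def log_binom_pmf(n, k):
    """log P{S_n = k} for S_n ~ Binomial(n, 1/2)."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1)
            - math.lgamma(n - k + 1) - n * math.log(2.0))

def log_prob_window(n, u, r):
    """log P{ S_n / n in [u - r, u + r] }, summed stably via log-sum-exp."""
    k_lo = max(0, math.ceil(n * (u - r) - 1e-9))
    k_hi = min(n, math.floor(n * (u + r) + 1e-9))
    logs = [log_binom_pmf(n, k) for k in range(k_lo, k_hi + 1)]
    m = max(logs)
    return m + math.log(sum(math.exp(v - m) for v in logs))

def J(x):
    """Toy rate function: relative entropy of Bernoulli(x) w.r.t. Bernoulli(1/2)."""
    if x in (0.0, 1.0):
        return math.log(2.0)
    return x * math.log(2.0 * x) + (1.0 - x) * math.log(2.0 * (1.0 - x))

def J_window(u, r):
    """Infimum of J over [u - r, u + r]; J is convex with its minimum at 1/2."""
    lo, hi = max(0.0, u - r), min(1.0, u + r)
    if lo <= 0.5 <= hi:
        return 0.0
    return min(J(lo), J(hi))

u, n = 0.7, 2000
# Empirical exponential rates (1/n) log P for shrinking windows.
rates = {r: log_prob_window(n, u, r) / n for r in (0.1, 0.05, 0.01)}
```

For fixed $r$ the computed rate is close to $-J(\{u\}^{(r)})$, and as $r$ shrinks the values $J(\{u\}^{(r)})$ increase toward $J(u)$, which is the content of part (c) in this toy setting.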



The main theorem of this section is the LDP for $Y_n$ with respect to the microcanonical
ensemble, given in Theorem 3.2. For $A \in \mathcal{F}_n$ this ensemble is defined by the conditioned
measures

$$P_n^{u,r}\{A\} = P_n\{A \mid H_n \in \{u\}^{(r)}\}, \tag{3.4}$$

where $u \in \operatorname{dom} J$ and $r \in (0, 1)$. For all sufficiently large $n$ it follows from part (b) of
Proposition 3.1 that $P_n\{H_n \in \{u\}^{(r)}\} > 0$ and hence that $P_n^{u,r}$ is well defined.



Theorem 3.2. Take $u \in \operatorname{dom} J$ and assume Hypotheses 2.1 and 2.2. With respect to the
conditioned measures $P_n^{u,r}$, $Y_n$ satisfies the LDP on $\mathcal{X}$, in the double limit $n \to \infty$ and
$r \to 0$, with rate function

$$I^u(x) = \begin{cases} I(x) - J(u) & \text{if } H(x) = u, \\ \infty & \text{otherwise.} \end{cases}$$

That is, for any closed subset $F$ of $\mathcal{X}$

$$\lim_{r \to 0} \limsup_{n \to \infty} \frac{1}{a_n} \log P_n^{u,r}\{Y_n \in F\} \le -I^u(F) \tag{3.5}$$

and for any open subset $G$ of $\mathcal{X}$

$$\lim_{r \to 0} \liminf_{n \to \infty} \frac{1}{a_n} \log P_n^{u,r}\{Y_n \in G\} \ge -I^u(G). \tag{3.6}$$



We first prove that $I^u$ defines a rate function. Clearly $I^u$ is nonnegative. For $u \in \operatorname{dom} J$
and $M < \infty$

$$\{x \in \mathcal{X} : I^u(x) \le M\} = \{x \in \mathcal{X} : I(x) \le M + J(u)\} \cap H^{-1}(\{u\}).$$

Since $J(u) < \infty$, $I$ has compact level sets, and $H^{-1}(\{u\})$ is closed, it follows that $I^u$ has
compact level sets.



Concerning the large deviation bounds in Theorem 3.2, we offer two proofs. The first
is preferred because it is close to the heuristic sketch of the LDP given in the introduction.
Throughout the two proofs we fix $u \in \operatorname{dom} J$.

The first proof of the large deviation upper bound actually derives a stronger inequality.
Namely, for all sufficiently small $r \in (0, 1)$ and any closed subset $F$ of $\mathcal{X}$

$$\limsup_{n \to \infty} \frac{1}{a_n} \log P_n^{u,r}\{Y_n \in F\} \le -I^u(F). \tag{3.7}$$

For any $x \in \mathcal{X}$ and $\alpha > 0$ we denote by $\bar{B}(x, \alpha)$ and $B(x, \alpha)$ the closed ball and the open
ball in $\mathcal{X}$ with center $x$ and radius $\alpha$. Let $\delta > 0$ be given. Since $I$ is lower semicontinuous,
for any $x \in \mathcal{X}$ and all sufficiently small $\alpha > 0$ we have $I(\bar{B}(x, \alpha)) \ge I(x) - \delta$. Now take
any $x \in \mathcal{X}$ such that $H(x) = u$. For any $r \in (0, 1)$ and all sufficiently small $\alpha$ the large
deviation upper bound for $Y_n$ with respect to $P_n$ and part (b) of Proposition 3.1 yield

$$
\begin{aligned}
\limsup_{n \to \infty} \frac{1}{a_n} \log P_n^{u,r}\{Y_n \in \bar{B}(x, \alpha)\}
&\le \limsup_{n \to \infty} \frac{1}{a_n} \log P_n\{\{Y_n \in \bar{B}(x, \alpha)\} \cap \{H_n \in \{u\}^{(r)}\}\} - \liminf_{n \to \infty} \frac{1}{a_n} \log P_n\{H_n \in \{u\}^{(r)}\} \\
&\le \limsup_{n \to \infty} \frac{1}{a_n} \log P_n\{Y_n \in \bar{B}(x, \alpha)\} - \liminf_{n \to \infty} \frac{1}{a_n} \log P_n\{H_n \in \{u\}^{(r)}\} \\
&\le -I(\bar{B}(x, \alpha)) + J(u) \\
&\le -I(x) + J(u) + \delta \\
&= -I^u(x) + \delta.
\end{aligned} \tag{3.8}
$$

Now take any $x \in \mathcal{X}$ such that $H(x) \ne u$. Thus $I^u(x) = \infty$, and there exists $\bar{r} \in (0, 1)$
such that $H(x) \notin \{u\}^{(\bar{r})}$. By the approximation property (2.4) and the continuity of $H$,
for any $r \in (0, \bar{r})$, all sufficiently small $\alpha > 0$, and all sufficiently large $n$ we have

$$\{Y_n \in \bar{B}(x, \alpha)\} \cap \{H_n \in \{u\}^{(r)}\} \subset \{Y_n \in \bar{B}(x, \alpha)\} \cap \{H(Y_n) \in \{u\}^{(\bar{r})}\} = \emptyset.$$

Hence for such $r$ and $\alpha$

$$
\begin{aligned}
\limsup_{n \to \infty} \frac{1}{a_n} \log P_n^{u,r}\{Y_n \in \bar{B}(x, \alpha)\}
&\le \limsup_{n \to \infty} \frac{1}{a_n} \log P_n\{\{Y_n \in \bar{B}(x, \alpha)\} \cap \{H_n \in \{u\}^{(r)}\}\} - \liminf_{n \to \infty} \frac{1}{a_n} \log P_n\{H_n \in \{u\}^{(r)}\} \\
&= -\infty = -I^u(x).
\end{aligned}
$$

We have proved that for any $x \in \mathcal{X}$, all sufficiently small $r \in (0, 1)$, and all sufficiently
small $\alpha > 0$

$$\limsup_{n \to \infty} \frac{1}{a_n} \log P_n^{u,r}\{Y_n \in \bar{B}(x, \alpha)\} \le -I^u(x) + \delta.$$

Let $F$ be a compact subset of $\mathcal{X}$. We can cover $F$ with finitely many closed balls $\bar{B}(x_i, \alpha_i)$
with $x_i \in F$ and $\alpha_i > 0$ so small that the last display is valid for $x = x_i$, all sufficiently
small $r \in (0, 1)$, and $\alpha = \alpha_i$. It follows that for all sufficiently small $r \in (0, 1)$

$$\limsup_{n \to \infty} \frac{1}{a_n} \log P_n^{u,r}\{Y_n \in F\} \le -\min_i I^u(x_i) + \delta \le -I^u(F) + \delta.$$

Sending $\delta \to 0$ yields the upper bound (3.7). Finally, for any closed set $F$ the upper
bound (3.7) is a consequence of the following uniform exponential tightness estimate.
Lemma 3.3. Fix $u \in \operatorname{dom} J$. Then for all sufficiently large $M \in (0, \infty)$ there exists a
compact subset $D$ of $\mathcal{X}$ such that for every $r \in (0, 1)$

$$\limsup_{n \to \infty} \frac{1}{a_n} \log P_n^{u,r}\{Y_n \in D^c\} \le -M.$$

Proof. Given $u \in \operatorname{dom} J$, we take $M > J(u)$. As shown in the proof of Lemma 2.6 in
[34], the large deviation upper bound satisfied by $Y_n$ with respect to $P_n$ implies that there
exists a compact subset $D$ of $\mathcal{X}$ such that

$$\limsup_{n \to \infty} \frac{1}{a_n} \log P_n\{Y_n \in D^c\} \le -2M.$$

Since for every $r \in (0, 1)$

$$P_n^{u,r}\{Y_n \in D^c\} \le \frac{P_n\{Y_n \in D^c\}}{P_n\{H_n \in \{u\}^{(r)}\}},$$

it follows from part (b) of Proposition 3.1 that

$$
\begin{aligned}
\limsup_{n \to \infty} \frac{1}{a_n} \log P_n^{u,r}\{Y_n \in D^c\}
&\le \limsup_{n \to \infty} \frac{1}{a_n} \log P_n\{Y_n \in D^c\} - \liminf_{n \to \infty} \frac{1}{a_n} \log P_n\{H_n \in \{u\}^{(r)}\} \\
&\le -2M + J(u) \le -M.
\end{aligned}
$$

This completes the proof. ■

We next prove the large deviation lower bound in Theorem 3.2 by showing that for
any fixed $r \in (0, 1)$ and any open subset $G$ of $\mathcal{X}$

$$\liminf_{n \to \infty} \frac{1}{a_n} \log P_n^{u,r}\{Y_n \in G\} \ge -I^u(G) + J(\{u\}^{(r)}) - J(u). \tag{3.9}$$

Sending $r \to 0$ and using part (c) of Proposition 3.1 yields the large deviation lower bound
in Theorem 3.2.

Let $x$ be any point in $G$ such that $H(x) = u$. By the approximation property (2.4)
and the continuity of $H$, for any number $r^-$ satisfying $0 < r^- < r$ and all sufficiently large
$n$, we can choose $\alpha > 0$ to be so small that $B(x, \alpha) \subset G$ and

$$\{Y_n \in B(x, \alpha)\} \cap \{H_n \in \{u\}^{(r)}\} \supset \{Y_n \in B(x, \alpha)\} \cap \{H(Y_n) \in \{u\}^{(r^-)}\} = \{Y_n \in B(x, \alpha)\}.$$

Hence for such $\alpha$, the large deviation lower bound for $Y_n$ with respect to $P_n$ and part (b)
of Proposition 3.1 yield

$$
\begin{aligned}
\liminf_{n \to \infty} \frac{1}{a_n} \log P_n^{u,r}\{Y_n \in G\}
&\ge \liminf_{n \to \infty} \frac{1}{a_n} \log P_n^{u,r}\{Y_n \in B(x, \alpha)\} \\
&\ge \liminf_{n \to \infty} \frac{1}{a_n} \log P_n\{\{Y_n \in B(x, \alpha)\} \cap \{H_n \in \{u\}^{(r)}\}\} - \limsup_{n \to \infty} \frac{1}{a_n} \log P_n\{H_n \in \{u\}^{(r)}\} \\
&\ge \liminf_{n \to \infty} \frac{1}{a_n} \log P_n\{Y_n \in B(x, \alpha)\} - \limsup_{n \to \infty} \frac{1}{a_n} \log P_n\{H_n \in \{u\}^{(r)}\} \\
&\ge -I(B(x, \alpha)) + J(\{u\}^{(r)}) \\
&\ge -I(x) + J(\{u\}^{(r)}) \\
&= -I^u(x) + J(\{u\}^{(r)}) - J(u).
\end{aligned}
$$

Now take any $x \in G$ such that $H(x) \ne u$. Since $I^u(x) = \infty$, it follows that

$$\liminf_{n \to \infty} \frac{1}{a_n} \log P_n^{u,r}\{Y_n \in G\} \ge -\infty = -I^u(x) + J(\{u\}^{(r)}) - J(u).$$

We have thus obtained the same lower bound for all $x \in G$. We conclude that

$$
\begin{aligned}
\liminf_{n \to \infty} \frac{1}{a_n} \log P_n^{u,r}\{Y_n \in G\}
&\ge \sup_{x \in G} \{-I^u(x)\} + J(\{u\}^{(r)}) - J(u) \\
&= -I^u(G) + J(\{u\}^{(r)}) - J(u).
\end{aligned}
$$

This completes the proof of the large deviation lower bound (3.9). The proof of Theorem
3.2 is done.

The second proof of the large deviation bounds in Theorem 3.2 uses the following
alternate representation for the rate function:

$$I^u(x) = I(\{x\} \cap H^{-1}(\{u\})) - J(u).$$

Let $F$ be any closed subset of $\mathcal{X}$. We choose $\psi$ to be any function mapping $(0, 1)$ onto
$(0, 1)$ with the properties that $\psi(r) > r$ for all $r \in (0, 1)$ and $\lim_{r \to 0} \psi(r) = 0$. Clearly, as
$r \downarrow 0$, $\{u\}^{(\psi(r))} \downarrow \{u\}$. We need the limit

$$\lim_{r \to 0} I(F \cap H^{-1}(\{u\}^{(\psi(r))})) = I(F \cap H^{-1}(u)),$$

which follows from routine calculations using the continuity of $H$ and the fact that $I^u$
is a rate function. The proof of this limit is omitted. The rest of the proof of the large
deviation upper bound is straightforward. By the approximation property (2.4) and the
continuity of $H$, for any $r \in (0, 1)$ and all sufficiently large $n$

$$P_n\{\{Y_n \in F\} \cap \{H_n \in \{u\}^{(r)}\}\} \le P_n\{\{Y_n \in F\} \cap \{H(Y_n) \in \{u\}^{(\psi(r))}\}\}.$$

Then the large deviation upper bound for $Y_n$ with respect to $P_n$ and part (c) of Proposition
3.1 yield

$$
\begin{aligned}
\lim_{r \to 0} \limsup_{n \to \infty} \frac{1}{a_n} \log P_n^{u,r}\{Y_n \in F\}
&\le \lim_{r \to 0} \limsup_{n \to \infty} \frac{1}{a_n} \log P_n\{Y_n \in [F \cap H^{-1}(\{u\}^{(\psi(r))})]\} - \lim_{r \to 0} \liminf_{n \to \infty} \frac{1}{a_n} \log P_n\{H_n \in \{u\}^{(r)}\} \\
&\le -\lim_{r \to 0} I(F \cap H^{-1}(\{u\}^{(\psi(r))})) + J(u) \\
&= -I(F \cap H^{-1}(u)) + J(u) \\
&= -I^u(F).
\end{aligned}
$$

This is the large deviation upper bound (3.5).

Now let $G$ be any open subset of $\mathcal{X}$. Again by the approximation property (2.4) and
the continuity of $H$, for any number $r^-$ satisfying $0 < r^- < r$ and all sufficiently large $n$

$$
\begin{aligned}
P_n\{\{Y_n \in G\} \cap \{H_n \in \{u\}^{(r)}\}\}
&\ge P_n\{\{Y_n \in G\} \cap \{H(Y_n) \in \{u\}^{(r^-)}\}\} \\
&\ge P_n\{Y_n \in G \cap H^{-1}(\operatorname{int}\{u\}^{(r^-)})\}.
\end{aligned}
$$

The large deviation lower bound for $Y_n$ with respect to $P_n$ and part (c) of Proposition
3.1 yield

$$
\begin{aligned}
\lim_{r \to 0} \liminf_{n \to \infty} \frac{1}{a_n} \log P_n^{u,r}\{Y_n \in G\}
&\ge \lim_{r \to 0} \liminf_{n \to \infty} \frac{1}{a_n} \log P_n\{Y_n \in [G \cap H^{-1}(\operatorname{int}\{u\}^{(r^-)})]\} - \lim_{r \to 0} \lim_{n \to \infty} \frac{1}{a_n} \log P_n\{H_n \in \{u\}^{(r)}\} \\
&\ge -\lim_{r \to 0} I(G \cap H^{-1}(\operatorname{int}\{u\}^{(r^-)})) + J(u) \\
&\ge -I(G \cap H^{-1}(u)) + J(u) \\
&= -I^u(G).
\end{aligned}
$$

This is the large deviation lower bound (3.6), completing the second proof of the large
deviation bounds in Theorem 3.2. The proof of Theorem 3.2 is done.

In Section 2 the large deviation analysis of the canonical ensemble led us to define,
in terms of the rate function in the corresponding LDP, the set of canonical equilibrium
macrostates. Analogously, for $u \in \operatorname{dom} J$ we define, in terms of the rate function $I^u$ in
Theorem 3.2, the set of microcanonical equilibrium macrostates

$$\mathcal{E}^u = \{x \in \mathcal{X} : I^u(x) = 0\}.$$

Thus $x \in \mathcal{E}^u$ if and only if $I(x) = J(u)$ and $H(x) = u$. We next point out that in
certain models elements of $\mathcal{E}^u$ have an equivalent characterization in terms of constrained
maximum entropy principles.

Remark 3.4. Equivalent characterization in terms of constrained maximum
entropy principles. Since $J(u)$ equals the infimum of $I$ over all elements $x$ satisfying
the constraint $H(x) = u$, we see that $x \in \mathcal{E}^u$ if and only if $x$ solves the following constrained
minimization problem:

minimize $I(x)$ over $x \in \mathcal{X}$ subject to the constraint $H(x) = u$.

Both for the regularized point vortex model and the Miller-Robert model the rate function
$I$ equals a relative entropy, which in turn equals minus the physical entropy. Hence
for these models the last display gives an equivalent characterization of microcanonical
equilibrium macrostates in terms of a constrained maximum entropy principle. ■



Parts (c) and (d) of Theorem 2.4 state several properties of the set $\mathcal{E}_\beta$ of canonical
equilibrium macrostates. The next theorem gives analogous properties of $\mathcal{E}^u$. The second
of these properties is slightly more complicated than in the canonical case because the
microcanonical measures $P_n^{u,r}$ depend on the two parameters $n \in \mathbb{N}$ and $r \in (0, 1)$.

Theorem 3.5. We assume Hypotheses 2.1 and 2.2. For $u \in \operatorname{dom} J$ the following conclusions
hold.

(a) $\mathcal{E}^u = \{x \in \mathcal{X} : I^u(x) = 0\}$ is a nonempty, compact subset of $\mathcal{X}$. A point $x \in \mathcal{X}$
lies in $\mathcal{E}^u$ if and only if $I(x) = J(u)$ and $H(x) = u$; equivalently, if and only if $x$ solves
the following constrained minimization problem:

minimize $I(x)$ over $x \in \mathcal{X}$ subject to the constraint $H(x) = u$.

(b) Let $A$ be any Borel subset of $\mathcal{X}$ whose closure $\bar{A}$ satisfies $\bar{A} \cap \mathcal{E}^u = \emptyset$. Then
$I^u(\bar{A}) > 0$. In addition, there exists $r_0 \in (0, 1)$ and for all $r \in (0, r_0]$ there exists $C_r < \infty$
such that

$$P_n^{u,r}\{Y_n \in A\} \le C_r \exp[-a_n I^u(\bar{A})/2] \to 0 \quad \text{as } n \to \infty.$$

Proof. (a) $\mathcal{E}^u$ is a nonempty, compact subset of $\mathcal{X}$ because $I^u$ is a rate function. The
equivalent characterizations of $x \in \mathcal{E}^u$ follow from the formula for $I^u$.

(b) If $\bar{A} \cap \mathcal{E}^u = \emptyset$, then for each $x \in \bar{A}$ we have $I^u(x) > 0$. Since $I^u$ is a rate function,
it follows that $I^u(\bar{A}) > 0$. The large deviation upper bound for the $P_n^{u,r}$-distributions of
$Y_n$ given in (3.5) completes the proof. ■



Part (b) of Theorem 3.5 can be regarded as a concentration property of the $P_n^{u,r}$-distributions
of $Y_n$. This property justifies calling $\mathcal{E}^u$ the set of microcanonical equilibrium
macrostates.

Theorem 2.5 studies compactness properties of the sequence of $P_{n,a_n\beta}$-distributions
of $Y_n$ and shows that any weak limit of a convergent subsequence of this sequence is
concentrated on $\mathcal{E}_\beta$. In the next theorem we formulate an analogue for the microcanonical
ensemble, studying compactness and weak limit properties of the $P_n^{u,r}$-distributions of $Y_n$.
In the case of the classical lattice gas, a related result is given, for example, in [12, Lem.
4.1].



Theorem 3.6. We assume Hypotheses 2.1 and 2.2. For $u \in \operatorname{dom} J$ the following conclusions
hold.

(a) For $r \in (0, 1)$, any subsequence of $P_n^{u,r}\{Y_n \in \cdot\}$ has a subsubsequence $P_{n'}^{u,r}\{Y_{n'} \in \cdot\}$
converging weakly to a probability measure $\Pi^{u,r}$ on $\mathcal{X}$ as $n' \to \infty$.

(b) There exists $r_0 \in (0, 1)$ such that for all $r \in (0, r_0]$, $\Pi^{u,r}$ is concentrated on $\mathcal{E}^u$;
i.e., $\Pi^{u,r}\{(\mathcal{E}^u)^c\} = 0$. Thus if $\mathcal{E}^u$ consists of a unique point $\tilde{x}$, then for all $r \in (0, r_0]$ the
entire sequence $P_n^{u,r}\{Y_n \in \cdot\}$ converges weakly to $\delta_{\tilde{x}}$ as $n \to \infty$.

(c) For any sequence $r_k \subset (0, 1)$ converging to 0, any subsequence of $\Pi^{u,r_k}$ has a
subsubsequence converging weakly to a probability measure $\Pi^u$ on $\mathcal{X}$ that is concentrated
on $\mathcal{E}^u$.

Proof. (a) Define $a^* = \min_{n \in \mathbb{N}} a_n > 0$. The exponential tightness estimate in Lemma
3.3 implies that for all sufficiently large $M \in (0, \infty)$ there exists a compact subset $D$ of
$\mathcal{X}$ such that for all $r \in (0, 1)$ and all sufficiently large $n$

$$P_n^{u,r}\{Y_n \in D^c\} \le \exp[-a_n M/2] \le \exp[-a^* M/2]. \tag{3.10}$$

Since $M$ can be taken to be arbitrarily large, this yields the tightness of the sequence
$P_n^{u,r}\{Y_n \in \cdot\}$. The tightness implies that any subsequence of $P_n^{u,r}\{Y_n \in \cdot\}$ has a subsubsequence
$P_{n'}^{u,r}\{Y_{n'} \in \cdot\}$ converging weakly to a probability measure $\Pi^{u,r}$ on $\mathcal{X}$ as $n' \to \infty$
[Prohorov's Theorem]. This completes the proof of part (a).

(b) We use the value of $r_0$ from part (b) of Theorem 3.5. As in the proof of Theorem
2.5, in order to prove the concentration property of $\Pi^{u,r}$, we write the open set $(\mathcal{E}^u)^c$ as
a union of countably many open balls $V_j$ such that the closure $\bar{V}_j$ of each $V_j$ has empty
intersection with $\mathcal{E}^u$. Let $P_{n'}^{u,r}\{Y_{n'} \in \cdot\} \Rightarrow \Pi^{u,r}$ be the subsubsequence arising in the proof
of part (a) of the present theorem. For $r \in (0, r_0]$, part (b) of Theorem 3.5 implies that
$P_{n'}^{u,r}\{Y_{n'} \in \bar{V}_j\} \to 0$ as $n' \to \infty$, and so

$$0 = \liminf_{n' \to \infty} P_{n'}^{u,r}\{Y_{n'} \in V_j\} \ge \Pi^{u,r}\{V_j\}.$$

It follows that $\Pi^{u,r}\{V_j\} = 0$ and thus that $\Pi^{u,r}\{(\mathcal{E}^u)^c\} = 0$, as claimed. If $\mathcal{E}^u$ consists
of a unique point $\tilde{x}$, then as in the proof of Theorem 2.5, one shows that as $n \to \infty$,
$P_n^{u,r}\{Y_n \in \cdot\} \Rightarrow \delta_{\tilde{x}}$. This completes the proof of part (b).

(c) This follows from part (b), Prohorov's Theorem, and the compactness of $\mathcal{E}^u$. The
proof of Theorem 3.6 is complete. ■
4 Equivalence and Nonequivalence of Ensembles

In the preceding section we presented, for the microcanonical ensemble, analogues of
results proved for the canonical ensemble in Section 2. These include large deviation
theorems and properties of the set of equilibrium macrostates. Such analogues of results
for the two ensembles point to a much deeper relationship between them. As we will
soon see, the two ensembles are intimately related both at the level of thermodynamic
functions and at the level of equilibrium macrostates, and the results at these two levels
mirror each other.

Our main results on equivalence and nonequivalence of ensembles at the level of equilibrium
macrostates are presented in Theorems 4.4, 4.6, and 4.8 and are summarized
in Figure 1. Definitive and complete, they express, in terms of concavity and other
properties of the microcanonical entropy, relationships between the sets of canonical and
microcanonical equilibrium macrostates. The proofs of these relationships are based on
straightforward concave analysis. Other results in this section explore related issues. For
example, Corollary 4.7 is a uniqueness result for equilibrium macrostates, Theorem 4.10
relates the equivalence of ensembles to the differentiability of the canonical free energy,
and Theorem 4.11 shows that a certain equivalence-of-ensemble relationship implies a
concavity property of the microcanonical entropy.

We start our presentation by recalling an elementary result at the level of thermodynamic
functions. The microcanonical entropy is the nonpositive function defined for
$u \in \mathbb{R}^\sigma$ by

$$s(u) = -J(u) = -\inf\{I(x) : x \in \mathcal{X},\ H(x) = u\}.$$

We define $\operatorname{dom} s$ as the set of $u \in \mathbb{R}^\sigma$ for which $s(u) > -\infty$. As shown in (3.3), the
canonical free energy $\varphi(\beta)$ can be obtained from $s$ by the formula

$$\varphi(\beta) = \inf_{u \in \mathbb{R}^\sigma} \{\langle \beta, u \rangle - s(u)\}, \tag{4.1}$$

which expresses $\varphi$ as the Legendre-Fenchel transform $s^*$ of $s$. In general, $\varphi = s^*$ is finite,
concave, and continuous on $\mathbb{R}^\sigma$ [Thm. 2.4(a)], and $s$ is upper semicontinuous [Prop.
3.1(a)]. If it is the case that $s$ is concave on $\mathbb{R}^\sigma$, then concave function theory implies
that $s$ equals the Legendre-Fenchel transform of $\varphi$ [[1^, p. 104]; viz., for $u \in \mathbb{R}^\sigma$

$$s(u) = \varphi^*(u) = \inf_{\beta \in \mathbb{R}^\sigma} \{\langle \beta, u \rangle - \varphi(\beta)\}. \tag{4.2}$$

If $s$ is concave on $\mathbb{R}^\sigma$, then following standard terminology in the statistical mechanical
literature, we say that the canonical ensemble and the microcanonical ensemble are
thermodynamically equivalent [28, 33]. As we will see, when properly interpreted, the
nonconcavity of $s$ at points $u \in \mathbb{R}^\sigma$ will imply that the ensembles are nonequivalent at
the level of equilibrium macrostates for those values of $u$ [Thm. 4.4(b)]. Further connections
between thermodynamic equivalence of ensembles and equivalence of ensembles
at the level of equilibrium macrostates are made explicit in Theorem 4.9. In particular,
under a hypothesis on the domains of various functions that is not necessarily satisfied in
all models of interest, thermodynamic equivalence of ensembles is a necessary and sufficient
condition for equivalence of ensembles to hold at the level of equilibrium macrostates
[Thm. 4.9(c)].

The concavity of $s$ on $\mathbb{R}^\sigma$ depends on the nature of $I$ and $H$. For example, if $I$
is convex on $\mathcal{X}$ and $H$ is affine, then $J$ is convex and so $s = -J$ is concave on $\mathbb{R}^\sigma$. However, in general the
concavity of $s$ is not valid. In fact, because of the local mean-field, long-range nature
of the Hamiltonians arising in many models of turbulence, including the Miller-Robert
model [Example [0|(b)], the associated microcanonical entropies are typically not concave
on subsets of $\mathbb{R}^\sigma$ corresponding to a range of negative temperatures.

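In order to see how concavity properties of $s$ determine relationships between the sets
of equilibrium macrostates, we define for $u \in \mathbb{R}^\sigma$ the concave function

$$s^{**}(u) = \inf_{\beta \in \mathbb{R}^\sigma} \{\langle \beta, u \rangle - s^*(\beta)\} = \inf_{\beta \in \mathbb{R}^\sigma} \{\langle \beta, u \rangle - \varphi(\beta)\}.$$

Because of (4.2), it is obvious that $s$ is concave on $\mathbb{R}^\sigma$ if and only if $s$ and $s^{**}$ coincide.
Whenever $s(u) > -\infty$ and $s(u) = s^{**}(u)$, we shall say that $s$ is concave at $u$.

Now assume that $s$ is not concave on $\mathbb{R}^\sigma$. Since for any $u \in \operatorname{dom} s$ and all $\beta \in \mathbb{R}^\sigma$ we
have $s(u) \le \langle \beta, u \rangle - s^*(\beta)$, it follows that for all $u \in \mathbb{R}^\sigma$

$$s(u) \le \inf_{\beta \in \mathbb{R}^\sigma} \{\langle \beta, u \rangle - s^*(\beta)\} = s^{**}(u). \tag{4.3}$$

In addition, if $f$ is any upper semicontinuous, concave function satisfying $s(u) \le f(u)$ for
all $u \in \mathbb{R}^\sigma$, then for all $\beta \in \mathbb{R}^\sigma$, $s^*(\beta) \ge f^*(\beta)$ and thus $s^{**}(u) \le f^{**}(u) = f(u)$ for all
$u \in \mathbb{R}^\sigma$. It follows that if $s$ is not concave on $\mathbb{R}^\sigma$, then $s^{**}$ is the upper semicontinuous,
concave hull of $s$; i.e., the smallest upper semicontinuous, concave function on $\mathbb{R}^\sigma$ that
majorizes $s$. In particular, if $s(u) > -\infty$, then $s^{**}(u) > -\infty$; thus $\operatorname{dom} s \subset \operatorname{dom} s^{**}$.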
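The relation between $s$ and its concave hull $s^{**}$ can be explored numerically. The following is a minimal sketch under stated assumptions: $\sigma = 1$, a hypothetical nonconcave entropy $s(u) = -(u^2 - 1)^2$ restricted to $u \in [-2, 2]$ (so $s = -\infty$ elsewhere), and finite grids for $u$ and $\beta$. The function name and grid sizes are illustrative choices, and the discrete transforms only approximate the true infima.

```python
import numpy as np

def legendre_fenchel_concave(values, grid, dual_grid):
    """Return f*(b) = min_a { b * a - f(a) } for each b in dual_grid,
    where f is tabulated as `values` on `grid` (concave-conjugate convention)."""
    return np.min(dual_grid[:, None] * grid[None, :] - values[None, :], axis=1)

# Hypothetical entropy with a nonconcave dip around u = 0 (assumption).
u = np.linspace(-2.0, 2.0, 401)
beta = np.linspace(-40.0, 40.0, 801)
s = -(u**2 - 1.0) ** 2

# phi = s* as in (4.1), then s** = (s*)*.
phi = legendre_fenchel_concave(s, u, beta)
s_hull = legendre_fenchel_concave(phi, beta, u)

# Inequality (4.3): s <= s**. A positive gap marks points where s is not concave.
gap = s_hull - s
nonconcave = u[gap > 1e-3]
```

For this toy entropy the gap is positive on roughly the interval $(-1, 1)$, where $s^{**}$ is flat at the value 0, and is numerically negligible outside it, where $s$ and $s^{**}$ agree.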

Since $s^{**}$ is an upper semicontinuous, concave function, we can introduce a basic
concept in concave function theory that will play a key role in our results on equivalence
and nonequivalence of ensembles. For $u \in \operatorname{dom} s^{**}$ the superdifferential of $s^{**}$ at $u$ is
defined as the set $\partial s^{**}(u)$ consisting of $\beta \in \mathbb{R}^\sigma$ such that

$$s^{**}(w) \le s^{**}(u) + \langle \beta, w - u \rangle \quad \text{for all } w \in \mathbb{R}^\sigma; \tag{4.4}$$

any such $\beta$ is called a supergradient of $s^{**}$ at $u$. The effective domain of the superdifferential
of $s^{**}$ is defined to be the set $\operatorname{dom} \partial s^{**}$ consisting of $u \in \mathbb{R}^\sigma$ for which $\partial s^{**}(u)$ is
nonempty. It can be shown that [55, p. 217]

$$\operatorname{ri}(\operatorname{dom} s^{**}) \subset \operatorname{dom} \partial s^{**} \subset \operatorname{dom} s^{**}, \tag{4.5}$$

where for $A$ a subset of $\mathbb{R}^\sigma$, $\operatorname{ri}(A)$ denotes the relative interior of $A$. These relationships
imply that $\partial s^{**}(u)$ is nonempty for $u \in \operatorname{dom} s^{**}$ except possibly for $u$ in the relative
boundary of $\operatorname{dom} s^{**}$.

The purpose of this section is to investigate, in terms of concavity properties of $s$ and
$s^{**}$, relationships between the set $\mathcal{E}_\beta$ of canonical equilibrium macrostates and the set $\mathcal{E}^u$
of microcanonical equilibrium macrostates. We recall that for $\beta \in \mathbb{R}^\sigma$ and $u \in \operatorname{dom} s$
these sets are defined by

$$
\begin{aligned}
\mathcal{E}_\beta &= \{x \in \mathcal{X} : I_\beta(x) = 0\} \\
&= \Big\{x \in \mathcal{X} : I(x) + \langle \beta, H(x) \rangle = \inf_{y \in \mathcal{X}} \{I(y) + \langle \beta, H(y) \rangle\} = \varphi(\beta)\Big\}
\end{aligned}
$$

and

$$\mathcal{E}^u = \{x \in \mathcal{X} : I^u(x) = 0\} = \{x \in \mathcal{X} : H(x) = u,\ I(x) = -s(u)\}.$$

$I_\beta$ is the rate function in the LDP for the canonical ensemble [Thm. 2.4], and $I^u$ is the
rate function in the LDP for the microcanonical ensemble [Thm. 3.2]. As the sets of
points at which the corresponding rate functions attain their minimum of 0, both $\mathcal{E}_\beta$ for
$\beta \in \mathbb{R}^\sigma$ and $\mathcal{E}^u$ for $u \in \operatorname{dom} s$ are nonempty and compact. It is convenient to extend the
definition of $\mathcal{E}^u$ to all $u \in \mathbb{R}^\sigma$ by defining $\mathcal{E}^u = \emptyset$ for $u \in \mathbb{R}^\sigma \setminus \operatorname{dom} s$.

First-order differentiability conditions show that relationships between $\mathcal{E}_\beta$ and $\mathcal{E}^u$ are
plausible. In fact, the first-order condition for $x^* \in \mathcal{X}$ to be in $\mathcal{E}_\beta$ is

$$I'(x^*) + \langle \beta, H'(x^*) \rangle = 0, \tag{4.6}$$

where $'$ denotes the Fréchet derivative and we assume that $I$ and $H$ are Fréchet-differentiable.
The first-order condition for $x^* \in \mathcal{X}$ to be in $\mathcal{E}^u$ is also (4.6), where $\beta$ is a Lagrange multiplier
dual to the constraint $H(x^*) = u$. In order to see the precise relationships between
$\mathcal{E}^u$ and $\mathcal{E}_\beta$, we need a more detailed analysis.

As we will see, there are three possible relationships that can occur between $\mathcal{E}^u$ and
$\mathcal{E}_\beta$. If for a given $u \in \operatorname{dom} s$ there exists $\beta \in \mathbb{R}^\sigma$ such that $\mathcal{E}^u = \mathcal{E}_\beta$, then the ensembles
are said to be fully equivalent, or full equivalence of ensembles is said to hold. If instead of
equality $\mathcal{E}^u$ is a proper subset of $\mathcal{E}_\beta$ for some $\beta \in \mathbb{R}^\sigma$, then the ensembles are said to be
partially equivalent, or partial equivalence of ensembles is said to hold. It may also happen
that $\mathcal{E}^u \cap \mathcal{E}_\beta = \emptyset$ for all $\beta \in \mathbb{R}^\sigma$. If this occurs, then the microcanonical ensemble is said
to be nonequivalent to any canonical ensemble, or nonequivalence of ensembles is said to hold.
It is convenient to group the first two cases together. If for a given $u$ there exists $\beta$ such
that either $\mathcal{E}^u$ equals $\mathcal{E}_\beta$ or $\mathcal{E}^u$ is a proper subset of $\mathcal{E}_\beta$, then the ensembles are said to be
equivalent, or equivalence of ensembles is said to hold.

The probabilistic role played by $\mathcal{E}^u$ and $\mathcal{E}_\beta$ should be kept in mind when interpreting
these relationships. According to part (c) of Theorem 2.4, for any Borel subset $A$ whose
closure is disjoint from $\mathcal{E}_\beta$, $P_{n,a_n\beta}\{Y_n \in A\} \to 0$. Theorem 2.5 refines this by showing that
convergent subsequences of $P_{n,a_n\beta}\{Y_n \in \cdot\}$ have weak limits with support in $\mathcal{E}_\beta$. Theorems
3.5 and 3.6 do the same for the microcanonical ensemble. Only when $\mathcal{E}_\beta = \mathcal{E}^u = \{\tilde{x}\}$
can we be sure that the two ensembles give the same prediction in the sense of weak
convergence. A condition implying these equalities is given in Corollary 4.7.

A key insight revealed by our results is that the set $\mathcal{E}^u$ of microcanonical equilibrium
macrostates can be richer than the set $\mathcal{E}_\beta$ of canonical equilibrium macrostates. Specifically,
every $x \in \mathcal{E}_\beta$ is also in $\mathcal{E}^u$ for some $u$, but if the microcanonical entropy $s$ is not
concave at some $u$, then any $x \in \mathcal{E}^u$ does not lie in $\mathcal{E}_\beta$ for any $\beta$ (nonequivalence of ensembles).
This verbal description is made precise in Theorems 4.4 and 4.6, while Theorems
4.4 and 4.8 give necessary and sufficient conditions for equivalence of ensembles to hold.
The content of Theorem 4.6 is summarized in Figure 1(a). The contents of Theorems 4.4
and 4.8 are summarized in Figure 1(b).



Theorem 4.4 gives a geometric condition that is necessary and sufficient for equivalence
of ensembles to hold. We define $C$ to be the set of $u \in \mathbb{R}^\sigma$ for which there exists a
supporting hyperplane to the graph of $s$ at $(u, s(u))$. In symbols,

$$C = \{u \in \mathbb{R}^\sigma : \exists \beta \in \mathbb{R}^\sigma \ni s(w) \le s(u) + \langle \beta, w - u \rangle \text{ for all } w \in \mathbb{R}^\sigma\}. \tag{4.7}$$

If $u \in C$, then the $\beta$ appearing in this display is a normal vector to the supporting
hyperplane. According to part (a) of Theorem 4.4, for a particular $u \in \operatorname{dom} s$ equivalence
of ensembles holds if and only if $u \in C$. According to part (b) of the theorem, for a
particular $u \in \operatorname{dom} s$ nonequivalence of ensembles holds if and only if $u \notin C$.

Theorem 4.8 refines part (a) of Theorem 4.4 by giving a geometric condition that is
necessary and sufficient for full equivalence of ensembles to hold. We define $T$ to be the set
of $u \in \mathbb{R}^\sigma$ for which there exists a supporting hyperplane to the graph of $s$ that touches
the graph of $s$ only at $(u, s(u))$. In symbols,

$$T = \{u \in \mathbb{R}^\sigma : \exists \beta \in \mathbb{R}^\sigma \ni s(w) < s(u) + \langle \beta, w - u \rangle \text{ for all } w \ne u\}. \tag{4.8}$$

Clearly, $T$ is a subset of $C$, which is the set of $u$ for which equivalence of ensembles holds
[Thm. 4.4(a)]. According to Theorem 4.8, for a particular $u \in \operatorname{dom} s$ full equivalence of
ensembles holds if and only if $u \in T$.
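The geometric criteria involving $C$ and $T$ can be illustrated on a small example. The sketch below assumes a hypothetical setting that is not one of the paper's models: $\sigma = 1$ and an entropy $s$ that is finite only on finitely many values of $u$, given as a table. For such a table the set of $\beta$ satisfying the supporting inequality in (4.7) is an interval $[\mathrm{lo}, \mathrm{hi}]$ determined by chord slopes, so membership in $C$ and $T$ can be decided exactly. The function name and the table are illustrative.

```python
from fractions import Fraction

def classify(s, u):
    """Classify u as 'full', 'partial', or 'nonequivalent' for a finite
    entropy table s (dict u -> s(u)); s is taken to be -infinity elsewhere."""
    su = s[u]
    # s(w) <= s(u) + beta * (w - u) for all w  <=>  lo <= beta <= hi,
    # where lo and hi are extreme chord slopes to the right and left of u.
    lo = max((Fraction(s[w] - su, w - u) for w in s if w > u),
             default=float("-inf"))
    hi = min((Fraction(s[w] - su, w - u) for w in s if w < u),
             default=float("inf"))
    if lo > hi:
        return "nonequivalent"   # no supporting line: u is not in C
    if lo < hi:
        return "full"            # some beta gives strict inequality: u is in T
    return "partial"             # unique beta, line touches elsewhere: u in C \ T

# Hypothetical entropy table with an affine stretch (u = 0, 1, 2) and a dip (u = 3).
s = {0: -2, 1: -1, 2: 0, 3: -2, 4: -1, 5: -3}
labels = {u: classify(s, u) for u in s}
# labels == {0: 'full', 1: 'partial', 2: 'full',
#            3: 'nonequivalent', 4: 'full', 5: 'full'}
```

In this table $u = 1$ lies on an affine piece of the hull, so the only supporting line also touches the graph at $u = 0$ and $u = 2$, while $u = 3$ sits strictly below the hull and admits no supporting line at all.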

Before proving any results on the equivalence and nonequivalence of ensembles, we
point out an alternate representation of $C$ that will elucidate the connection between
these results and concavity properties of $s$ and $s^{**}$. In general $s$ is not concave on $\mathbb{R}^\sigma$.
According to part (b) of Lemma 4.1, $C$ equals the set of $u \in \operatorname{dom} \partial s^{**}$ at which $s$ is
concave; i.e., the set of $u \in \operatorname{dom} \partial s^{**}$ such that $s(u)$ equals the value at $u$ of the concave
function $s^{**}$. It follows from part (b) of Lemma 4.1 that if $s$ is not concave at some
$u \in \operatorname{dom} s$, then $u \notin C$ and so nonequivalence of ensembles holds [Thm. 4.4(b)].

It is easy to find a sufficient condition on $s^{**}$ for full equivalence of ensembles to hold.
Suppose that for some $u \in \mathbb{R}^\sigma$, $s(u) = s^{**}(u)$ and that there exists $\beta \in \mathbb{R}^\sigma$ such that

$$s^{**}(w) < s^{**}(u) + \langle \beta, w - u \rangle \quad \text{for all } w \ne u; \tag{4.9}$$

i.e., the inequality (4.4) defining $\beta \in \partial s^{**}(u)$ holds with strict inequality for all $w \ne u$.
Since $s(w) \le s^{**}(w)$, it follows that

$$s(w) < s(u) + \langle \beta, w - u \rangle \quad \text{for all } w \ne u. \tag{4.10}$$

That is, $u$ lies in $T$, which according to Theorem 4.8 is the subset of $\mathbb{R}^\sigma$ for which full
equivalence of ensembles holds. If, for example, $s^{**}$ is strictly concave in a neighborhood
of $u$, then (4.9) holds for any $\beta \in \partial s^{**}(u)$ and thus we have full equivalence of ensembles.

In order to find a sufficient condition on $s^{**}$ for partial equivalence of ensembles to
hold, let $u$ be a point in $\mathbb{R}^\sigma$ such that $s^{**}$ is affine in a neighborhood of $u$. Then except
in pathological cases, for any $\beta \in \mathbb{R}^\sigma$ the strict inequality (4.10) cannot be valid for all
$w \ne u$, and so partial equivalence of ensembles holds.
Part (b) of the next lemma gives the alternate representation of $C$ to which we referred
three paragraphs earlier. This representation involves the set

$$\Gamma = \{u \in \mathbb{R}^\sigma : s(u) = s^{**}(u)\}.$$

Lemma 4.1. (a) For $u$ and $\beta$ in $\mathbb{R}^\sigma$, $s(w) \le s(u) + \langle \beta, w - u \rangle$ for all $w \in \mathbb{R}^\sigma$ if and
only if both $s(u) = s^{**}(u)$ and $\beta \in \partial s^{**}(u)$.
(b) $C = \Gamma \cap \operatorname{dom} \partial s^{**}$, and $C \subset \Gamma \cap \operatorname{dom} s$.



Remark 4.2. It is not difficult to refine the second assertion in part (b) of this lemma
by showing that

$$\Gamma \cap \operatorname{ri}(\operatorname{dom} s) \subset C = \Gamma \cap \operatorname{dom} \partial s^{**} \subset \Gamma \cap \operatorname{dom} s.$$

This relationship implies that, except possibly for relative boundary points of $\operatorname{dom} s$, $C$
consists of $u \in \operatorname{dom} s$ for which $s(u) = s^{**}(u)$. According to Theorem 4.4, equivalence of
ensembles holds for a particular $u \in \operatorname{dom} s$ if and only if $u \in C$. Combining this with the
observation in the preceding sentence, we see that, except possibly for relative boundary
points of $\operatorname{dom} s$, equivalence of ensembles holds for $u \in \operatorname{dom} s$ if and only if $s(u) = s^{**}(u)$.

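Proof of Lemma 4.1. (a) We start the proof by first assuming that $s(w) \le s(u) +
\langle \beta, w - u \rangle$ for all $w \in \mathbb{R}^\sigma$. It follows that $u \in \operatorname{dom} s$ and that $\langle \beta, u \rangle - s(u) \le \langle \beta, w \rangle - s(w)$
for all $w \in \mathbb{R}^\sigma$. Therefore

$$\langle \beta, u \rangle - s(u) = \inf_{w \in \mathbb{R}^\sigma} \{\langle \beta, w \rangle - s(w)\} = \varphi(\beta).$$

Since $s^{**}(w) = \inf_{\gamma \in \mathbb{R}^\sigma} \{\langle \gamma, w \rangle - \varphi(\gamma)\} \le \langle \beta, w \rangle - \varphi(\beta)$, the last display and the inequality
$s(u) \le s^{**}(u)$ imply that for all $w \in \mathbb{R}^\sigma$

$$
\begin{aligned}
s^{**}(w) &\le \langle \beta, w \rangle - \varphi(\beta) = s(u) + \langle \beta, w \rangle - \langle \beta, u \rangle \\
&\le s^{**}(u) + \langle \beta, w - u \rangle.
\end{aligned}
$$

Thus $\beta \in \partial s^{**}(u)$. Setting $w = u$ yields $s(u) = s^{**}(u)$.

Now assume that $s(u) = s^{**}(u)$ and that $\beta \in \partial s^{**}(u)$; thus for all $w \in \mathbb{R}^\sigma$

$$s^{**}(w) \le s^{**}(u) + \langle \beta, w - u \rangle = s(u) + \langle \beta, w - u \rangle.$$

Since $s(w) \le s^{**}(w)$ for all $w \in \mathbb{R}^\sigma$, it follows that for all $w \in \mathbb{R}^\sigma$

$$s(w) \le s(u) + \langle \beta, w - u \rangle.$$

This completes the proof of part (a).

(b) The first assertion is an immediate consequence of part (a). As mentioned in the
proof of part (a), if $u \in C$, then $u \in \operatorname{dom} s$. We conclude that $C \subset \Gamma \cap \operatorname{dom} s$, as claimed.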



The next lemma will facilitate the proofs of a number of our results on the equivalence
and nonequivalence of ensembles. Part (b) refines one of the conditions in part (a),
substituting a weaker hypothesis that leads to the same conclusion.

Lemma 4.3. For $u$ and $\beta \in \mathbb{R}^\sigma$ the following conclusions hold.

(a) The inequality $s(w) \le s(u) + \langle \beta, w - u \rangle$ is valid for all $w \in \mathbb{R}^\sigma$ if and only if
$\mathcal{E}^u \ne \emptyset$ and $\mathcal{E}^u \subset \mathcal{E}_\beta$.

(b) If $\mathcal{E}^u \cap \mathcal{E}_\beta \ne \emptyset$, then $s(w) \le s(u) + \langle \beta, w - u \rangle$ for all $w \in \mathbb{R}^\sigma$.

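Proof. We first prove that if $s(w) \le s(u) + \langle \beta, w - u \rangle$ for all $w \in \mathbb{R}^\sigma$, then $\mathcal{E}^u \ne \emptyset$ and
$\mathcal{E}^u \subset \mathcal{E}_\beta$. The hypothesis implies that $u \in \operatorname{dom} s$ and that $\langle \beta, u \rangle - s(u) \le \langle \beta, w \rangle - s(w)$
for all $w \in \mathbb{R}^\sigma$. Therefore

$$\langle \beta, u \rangle - s(u) = \inf_{w \in \mathbb{R}^\sigma} \{ \langle \beta, w \rangle - s(w) \} = \varphi(\beta) = \inf_{y \in \mathcal{X}} \{ \langle \beta, H(y) \rangle + I(y) \}.$$

The fact that $u$ is an element of $\operatorname{dom} s$ implies that $\mathcal{E}^u \ne \emptyset$. Let $x$ be an arbitrary element
in $\mathcal{E}^u$. Since $H(x) = u$ and $I(x) = -s(u)$, the display implies that

$$\langle \beta, H(x) \rangle + I(x) = \inf_{y \in \mathcal{X}} \{ \langle \beta, H(y) \rangle + I(y) \}$$

and thus that $x \in \mathcal{E}_\beta$. Since $x$ is an arbitrary element in $\mathcal{E}^u$, it follows that $\mathcal{E}^u \subset \mathcal{E}_\beta$.

In order to complete the proof of part (a), it suffices to prove part (b). Thus suppose
that $\mathcal{E}^u \cap \mathcal{E}_\beta \ne \emptyset$ and let $x$ be an arbitrary element in $\mathcal{E}^u \cap \mathcal{E}_\beta$. Since $\mathcal{E}^u \ne \emptyset$, we have
$u \in \operatorname{dom} s$. In addition, since $H(x) = u$, $I(x) = -s(u)$, and

$$\langle \beta, H(x) \rangle + I(x) = \inf_{y \in \mathcal{X}} \{ \langle \beta, H(y) \rangle + I(y) \} = \varphi(\beta),$$

it follows that for all $w \in \mathbb{R}^\sigma$

$$\langle \beta, u \rangle - s(u) = \varphi(\beta) = \inf_{w' \in \mathbb{R}^\sigma} \{ \langle \beta, w' \rangle - s(w') \} \le \langle \beta, w \rangle - s(w).$$

Therefore $s(w) \le s(u) + \langle \beta, w - u \rangle$ for all $w \in \mathbb{R}^\sigma$, as claimed. ■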



The next theorem is our first main result. Part (a) states that for a particular $u \in \operatorname{dom} s$
equivalence of ensembles holds if and only if $u \in C$. In Theorem 4.9 we make explicit
the connection between part (a) and the relationship between thermodynamic equivalence
of ensembles and equivalence of ensembles at the level of equilibrium macrostates. Part
(b) of the next theorem states that for a particular $u \in \operatorname{dom} s$ nonequivalence of ensembles
holds if and only if $u \notin C$. In particular, if $s$ is not concave at some $u \in \operatorname{dom} \partial s^{**}$, then
the ensembles are nonequivalent at the level of equilibrium macrostates. Theorem 4.4 was
inspired by, and greatly improves upon, the presentation on pages 857-859 of [JT9, which
treats the regularized point vortex model. While part (b) of Theorem 4.4 is related to
part (b) of Lemma 5.1 in [|, our Theorem 4.4 makes the nonequivalence of ensembles
more explicit.



Theorem 4.4. We assume Hypotheses 2.1 and 2.2. For $u \in \operatorname{dom} s$ the following conclusions hold.

(a) $u \in C$ if and only if $\mathcal{E}^u \subset \mathcal{E}_\beta$ for some $\beta \in \mathbb{R}^\sigma$.

(b) $u \notin C$ if and only if $\mathcal{E}^u \cap \mathcal{E}_\beta = \emptyset$ for all $\beta \in \mathbb{R}^\sigma$.

Proof. (a) This is an immediate consequence of part (a) of Lemma 4.3.

(b) If $u \notin C$, then for any $\beta \in \mathbb{R}^\sigma$ the inequality $s(w) \le s(u) + \langle \beta, w - u \rangle$ does not
hold for all $w \in \mathbb{R}^\sigma$. Part (b) of Lemma 4.3 implies that $\mathcal{E}^u \cap \mathcal{E}_\beta = \emptyset$ for all $\beta \in \mathbb{R}^\sigma$. To
show the converse, assume that $\mathcal{E}^u \cap \mathcal{E}_\beta = \emptyset$ for all $\beta \in \mathbb{R}^\sigma$ and that $u \in C$. But if $u \in C$,
then part (a) of Lemma 4.3 implies that $\mathcal{E}^u \subset \mathcal{E}_\beta$ for some $\beta \in \mathbb{R}^\sigma$. This contradiction
shows that $u \notin C$, completing the proof. ■

In the next proposition we refine part (a) of Theorem 4.4 by specifying the set of $\beta$
for which $\mathcal{E}^u \subset \mathcal{E}_\beta$.



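Proposition 4.5. We assume Hypotheses 2.1 and 2.2. Then for $u \in C$, $\mathcal{E}^u \subset \mathcal{E}_\beta$ for all
$\beta \in \partial s^{**}(u)$ and $\mathcal{E}^u \cap \mathcal{E}_\beta = \emptyset$ for all $\beta \notin \partial s^{**}(u)$.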



Proof. For $u \in C$, part (b) of Lemma 4.1 implies that $s(u) = s^{**}(u)$ and $\partial s^{**}(u) \ne \emptyset$.
If $\beta \in \partial s^{**}(u)$, then part (a) of the same lemma implies that $s(w) \le s(u) + \langle \beta, w - u \rangle$
for all $w \in \mathbb{R}^\sigma$. Part (a) of Lemma 4.3 then implies that $\mathcal{E}^u \subset \mathcal{E}_\beta$. This proves the
first half of the proposition. On the other hand, if $\beta \notin \partial s^{**}(u)$, then it is not true that
$s(w) \le s(u) + \langle \beta, w - u \rangle$ for all $w \in \mathbb{R}^\sigma$ [Lem. 4.1(a)]. It follows from part (b) of Lemma
4.3 that $\mathcal{E}^u \cap \mathcal{E}_\beta = \emptyset$. ■



Theorem 4.4 considers $u \in \operatorname{dom} s$, proving that partial or full equivalence of ensembles
holds if and only if $u \in C$. The next theorem is our second main result. It shifts focus
from $u \in \operatorname{dom} s$ to $\beta \in \mathbb{R}^\sigma$, proving that every set $\mathcal{E}_\beta$ of canonical equilibrium macrostates
is a disjoint union of $\mathcal{E}^u$ for $u$ in a particular index set that depends on $\beta$.

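Theorem 4.6. We assume Hypotheses 2.1 and 2.2. Then for all $\beta \in \mathbb{R}^\sigma$, $H(\mathcal{E}_\beta) \subset \operatorname{dom} s$
and

$$\mathcal{E}_\beta = \bigcup_{u \in H(\mathcal{E}_\beta)} \mathcal{E}^u.$$

The sets $\mathcal{E}^u$, $u \in H(\mathcal{E}_\beta)$, are nonempty and disjoint.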

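Proof. Let $x$ be an arbitrary element in $\mathcal{E}_\beta$ and define $u = H(x)$. Since $I_\beta(x) = 0$, we
have

$$I(x) + \langle \beta, H(x) \rangle = \inf_{y \in \mathcal{X}} \{ I(y) + \langle \beta, H(y) \rangle \} < \infty,$$

and so $s(u) \ge -I(x) > -\infty$. Thus $u \in \operatorname{dom} s$. Because $x$ is an arbitrary element in $\mathcal{E}_\beta$,
this proves that $H(\mathcal{E}_\beta) \subset \operatorname{dom} s$. Since $u \in \operatorname{dom} s$, $\mathcal{E}^u$ can be characterized as the set of
$x \in \mathcal{X}$ satisfying $H(x) = u$ and $I(x) = -s(u)$.

We now prove that $x \in \mathcal{E}^u$. Since $x \in \mathcal{E}_\beta$, it follows that for any $y \in \mathcal{X}$

$$I(x) + \langle \beta, u \rangle = I(x) + \langle \beta, H(x) \rangle \le I(y) + \langle \beta, H(y) \rangle,$$

and thus for any $y \in \mathcal{X}$ satisfying $H(y) = u$, we have $I(x) \le I(y)$. This implies that

$$I(x) \le \inf \{ I(y) : y \in \mathcal{X}, H(y) = u \} = -s(u) \le I(x),$$

and so $I(x) = -s(u)$. It follows that $x \in \mathcal{E}^u$. Since $x$ is an arbitrary element in $\mathcal{E}_\beta$, we
have shown that

$$\mathcal{E}_\beta \subset \bigcup_{u \in H(\mathcal{E}_\beta)} \mathcal{E}^u.$$

In order to prove the reverse inclusion, we show that for any $u \in H(\mathcal{E}_\beta)$ we have
$\mathcal{E}^u \subset \mathcal{E}_\beta$. Any such $u$ has the form $u = H(y)$ for some $y \in \mathcal{E}_\beta$. From our work in the
preceding two paragraphs we know that $u \in \operatorname{dom} s$ and $y \in \mathcal{E}^u$. Thus $y \in \mathcal{E}^u \cap \mathcal{E}_\beta$. Since
$\mathcal{E}^u \cap \mathcal{E}_\beta \ne \emptyset$, it follows from Theorem 4.4 that $\mathcal{E}^u \subset \mathcal{E}_\beta$. This completes the proof of the
display in the theorem.

The sets $\mathcal{E}^u$, $u \in H(\mathcal{E}_\beta)$, are nonempty since any such $u$ lies in $\operatorname{dom} s$. The sets are
also disjoint since for $u \ne u'$, $x \in \mathcal{E}^u \cap \mathcal{E}^{u'}$ implies that $H(x)$ equals both $u$ and $u'$. The
proof of the theorem is complete. ■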

The following useful corollary states that when $\mathcal{E}_\beta$ consists of a unique point $x$, then
with $u = H(x)$, $\mathcal{E}^u$ consists of the unique point $x$. This follows from Theorem 4.6 since
$H(\mathcal{E}_\beta) = \{H(x)\}$. The corollary sharpens the result on page 861 of ]nj, which needs the
additional hypotheses that $s$ is strictly concave and essentially smooth in order to reach
the same conclusion.

Corollary 4.7. Suppose that $\mathcal{E}_\beta = \{x\}$ for some $\beta \in \mathbb{R}^\sigma$. Then $\mathcal{E}^u = \{x\}$, where
$u = H(x)$.
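The decomposition in Theorem 4.6 is easy to verify on a finite example. The sketch below assumes a hypothetical macrostate space `X` with eight points, each carrying a scalar value $H(x)$ and a rate $I(x)$ (all numbers are invented for illustration). It computes $\mathcal{E}_\beta$ as the set of minimizers of $I + \beta H$ and $\mathcal{E}^u$ as the set of minimizers of $I$ on $\{H = u\}$, then lists the pieces $\mathcal{E}^u$, $u \in H(\mathcal{E}_\beta)$.

```python
# Hypothetical finite macrostate space: name -> (H(x), I(x)), with min I = 0.
X = {"a": (0, 2), "b": (1, 1), "c": (2, 0), "d": (2, 0),
     "e": (3, 1), "f": (4, 0), "g": (5, 2), "h": (1, 3)}

def E_beta(beta):
    """Canonical equilibrium macrostates: minimizers of I(x) + beta*H(x)."""
    phi = min(I + beta * h for h, I in X.values())
    return {x for x, (h, I) in X.items() if I + beta * h == phi}

def E_u(u):
    """Microcanonical equilibrium macrostates: minimizers of I on {H = u}."""
    level = [I for h, I in X.values() if h == u]
    if not level:
        return set()          # u is not in dom s
    J = min(level)
    return {x for x, (h, I) in X.items() if h == u and I == J}

def decomposition(beta):
    """Return the pieces E^u, u in H(E_beta), indexed by u."""
    return {u: E_u(u) for u in {X[x][0] for x in E_beta(beta)}}

for beta in (-2, 0, 1, 2):
    print(beta, sorted(E_beta(beta)), decomposition(beta))
```

For every $\beta$ tried, the pieces are nonempty, pairwise disjoint, and their union is $\mathcal{E}_\beta$. The value $\beta = 2$ gives a single-point set $\mathcal{E}_\beta$, which is the situation of Corollary 4.7.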

We now turn our attention to a criterion for full equivalence of ensembles, which is
stated in terms of the set $T$ defined in (4.8). Part (a) of Theorem 4.4 states that for
a particular $u \in \operatorname{dom} s$ equivalence of ensembles holds if and only if $u \in C$. The next
theorem refines this by showing that full equivalence of ensembles holds if and only if
$u \in T$. Part (a) gives the sufficiency and part (b) the necessity.



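Theorem 4.8. We assume Hypotheses 2.1 and 2.2. The following conclusions hold.

(a) If $u \in T$, then there exists $\beta \in \partial s^{**}(u)$ such that $\mathcal{E}^u = \mathcal{E}_\beta$.

(b) If $u \in C \setminus T$, then $\mathcal{E}^u \subsetneq \mathcal{E}_\beta$ for all $\beta \in \partial s^{**}(u)$ and $\mathcal{E}^u \cap \mathcal{E}_\beta = \emptyset$ for all $\beta \notin \partial s^{**}(u)$.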

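Proof. (a) If $u \in T$, then there exists $\beta \in \mathbb{R}^\sigma$ such that $s(w) < s(u) + \langle \beta, w - u \rangle$ for all
$w \ne u$. Part (a) of Lemma 4.3 implies that $\mathcal{E}^u \subset \mathcal{E}_\beta$. Suppose that $\mathcal{E}^u$ is a proper subset
of $\mathcal{E}_\beta$. Then Theorem 4.6 implies the existence of $u' \ne u$ such that $\mathcal{E}^{u'} \ne \emptyset$ and $\mathcal{E}^{u'} \subset \mathcal{E}_\beta$,
and part (a) of Lemma 4.3 yields

$$s(w) \le s(u') + \langle \beta, w - u' \rangle \quad \text{for all } w \in \mathbb{R}^\sigma.$$

Setting $w = u$ and using the fact that $s(u') < s(u) + \langle \beta, u' - u \rangle$, we see that

$$s(u) \le s(u') + \langle \beta, u - u' \rangle < s(u) + \langle \beta, u' - u \rangle + \langle \beta, u - u' \rangle = s(u).$$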

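This contradiction shows that the assumption that $\mathcal{E}^u$ is a proper subset of $\mathcal{E}_\beta$ is false.
The proof of part (a) is complete.

[Figure 1, panel (a): for $\beta \in \mathbb{R}^\sigma$, $\mathcal{E}_\beta = \bigcup_{u \in H(\mathcal{E}_\beta)} \mathcal{E}^u$ [Thm. 4.6].]

(a) For $\beta \in \mathbb{R}^\sigma$, any $x \in \mathcal{E}_\beta$ lies in some $\mathcal{E}^u$.

[Figure 1, panel (b): three branches for $u \in \operatorname{dom} s$.
  $u \in T$. Full Equivalence: $\exists \beta \in \partial s^{**}(u)$ s.t. $\mathcal{E}^u = \mathcal{E}_\beta$ [Thm. 4.8(a)].
  $u \in C \setminus T$. Partial Equivalence: $\mathcal{E}^u \subsetneq \mathcal{E}_\beta$ $\forall \beta \in \partial s^{**}(u)$ [Thm. 4.8(b)].
  $u \notin C$. Nonequivalence: $\mathcal{E}^u \cap \mathcal{E}_\beta = \emptyset$ $\forall \beta \in \mathbb{R}^\sigma$ [Thm. 4.4(b)].]

(b) There are three possibilities for $u \in \operatorname{dom} s$. The two branches on the left lead to
equivalence results, whereas the other branch leads to a nonequivalence result. The
sets $C$ and $T$ are defined in (4.7) and (4.8).

Figure 1: Equivalence and nonequivalence of ensembles.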



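(b) For $u \in C \setminus T$, Proposition 4.5 implies that $\mathcal{E}^u \subset \mathcal{E}_\beta$ for all $\beta \in \partial s^{**}(u)$ and
$\mathcal{E}^u \cap \mathcal{E}_\beta = \emptyset$ for all $\beta \notin \partial s^{**}(u)$. We now show that for any $\beta \in \partial s^{**}(u)$, $\mathcal{E}^u$ is a proper
subset of $\mathcal{E}_\beta$. Since $\mathcal{E}^u \subset \mathcal{E}_\beta$, part (a) of Lemma 4.3 implies that $s(w) \le s(u) + \langle \beta, w - u \rangle$
for all $w \in \mathbb{R}^\sigma$. Since $u \notin T$, there exists $u' \ne u$ such that $s(u') = s(u) + \langle \beta, u' - u \rangle$.
Then for all $w \in \mathbb{R}^\sigma$

$$s(w) \le s(u) + \langle \beta, w - u \rangle = s(u') + \langle \beta, w - u' \rangle.$$

It now follows from part (a) of Lemma 4.3 that $\mathcal{E}^{u'} \ne \emptyset$ and $\mathcal{E}^{u'} \subset \mathcal{E}_\beta$. Thus $\mathcal{E}^u$ is a proper
subset of $\mathcal{E}_\beta$, as claimed. ■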
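The three cases distinguished by Theorems 4.4 and 4.8 (full equivalence, partial equivalence, nonequivalence) can be seen side by side on a finite example. The sketch below assumes a hypothetical macrostate space `X` with invented values of $H$ and $I$, and it scans a finite list `BETAS` of half-integer values of $\beta$. That list is an assumption chosen to contain every value at which $\mathcal{E}_\beta$ changes in this particular toy model; it is not a general-purpose search.

```python
from fractions import Fraction

# Hypothetical finite macrostate space: name -> (H(x), I(x)), with min I = 0.
X = {"a": (0, 2), "b": (1, 1), "c": (2, 0), "d": (2, 0),
     "e": (3, 1), "f": (4, 0), "g": (5, 2), "h": (1, 3)}

# Assumed finite scan of beta; it contains every breakpoint of this toy model.
BETAS = [Fraction(k, 2) for k in range(-6, 7)]

def E_beta(beta):
    """Canonical equilibrium macrostates: minimizers of I(x) + beta*H(x)."""
    phi = min(I + beta * h for h, I in X.values())
    return {x for x, (h, I) in X.items() if I + beta * h == phi}

def E_u(u):
    """Microcanonical equilibrium macrostates: minimizers of I on {H = u}."""
    J = min(I for h, I in X.values() if h == u)
    return {x for x, (h, I) in X.items() if h == u and I == J}

def classify(u):
    """Compare E^u with every scanned E_beta and report the type of equivalence."""
    Eu = E_u(u)
    meeting = [E_beta(b) for b in BETAS if Eu & E_beta(b)]
    if any(Eb == Eu for Eb in meeting):
        return "full"
    return "partial" if meeting else "nonequivalent"

result = {u: classify(u) for u in range(6)}
print(result)
```

In this example $u = 1$ is realized canonically only at $\beta = 1$, where $\mathcal{E}_\beta$ also contains macrostates with $H = 0$ and $H = 2$ (partial equivalence), while $u = 3$ is not realized by any $\beta$ (nonequivalence).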



We recall that thermodynamic equivalence of ensembles is said to hold when $s$ is concave
on $\mathbb{R}^\sigma$. The next theorem addresses the issue of how thermodynamic equivalence of
ensembles mirrors equivalence of ensembles at the level of equilibrium macrostates. Part
(a) shows that thermodynamic equivalence is a sufficient condition for macroscopic
equivalence to hold for all $u \in \operatorname{dom} \partial s$. Since when $s$ is concave on $\mathbb{R}^\sigma$ we have
$\operatorname{ri}(\operatorname{dom} s) \subset \operatorname{dom} \partial s \subset \operatorname{dom} s$, it follows that thermodynamic equivalence is a sufficient condition for
macroscopic equivalence to hold for all $u \in \operatorname{dom} s$ except possibly for relative boundary
points. Part (b) proves a partial converse to (a). In part (c) we point out that thermodynamic
equivalence is equivalent to macroscopic equivalence under an extra hypothesis
on the domains of $s$, $s^{**}$, and $\partial s^{**}$. The proof of the theorem follows readily from our
previous results. The theorem is related to Lemma 6.2 and Theorem 6.1 in [E3.

Theorem 4.9. (a) Assume that $s$ is concave on $\mathbb{R}^\sigma$. Then for all $u \in \operatorname{dom} \partial s$, $\mathcal{E}^u \subset \mathcal{E}_\beta$
for some $\beta \in \partial s(u)$. Thus, thermodynamic equivalence of ensembles implies equivalence
of ensembles at the level of equilibrium macrostates for all $u \in \operatorname{dom} \partial s$.

(b) Assume that $\operatorname{dom} s = \operatorname{dom} s^{**}$ and that for all $u \in \operatorname{dom} s$ there exists $\beta \in \mathbb{R}^\sigma$ such
that $\mathcal{E}^u \subset \mathcal{E}_\beta$. Then $s$ is concave on $\mathbb{R}^\sigma$. Thus, under the hypothesis that $\operatorname{dom} s = \operatorname{dom} s^{**}$,
equivalence of ensembles at the level of equilibrium macrostates for all $u \in \operatorname{dom} s$ implies
thermodynamic equivalence of ensembles.

(c) Assume that $\operatorname{dom} s = \operatorname{dom} s^{**} = \operatorname{dom} \partial s^{**}$. Then thermodynamic equivalence of
ensembles holds if and only if the ensembles are equivalent at the level of equilibrium
macrostates.



Proof. (a) If $s$ is concave on $\mathbb{R}^\sigma$, then $s = s^{**}$ on $\mathbb{R}^\sigma$ and $C = \operatorname{dom} \partial s^{**} = \operatorname{dom} \partial s$ [Lem.
4.1(b)]. Part (a) of Theorem 4.4 completes the proof of part (a).

(b) The hypotheses imply that any element of $\operatorname{dom} s$ is an element of $C$, which in
turn is a subset of $\Gamma = \{u \in \mathbb{R}^\sigma : s(u) = s^{**}(u)\}$. It follows that $s$ and $s^{**}$ agree on
$\operatorname{dom} s = \operatorname{dom} s^{**}$ and thus that $s$ is concave on $\mathbb{R}^\sigma$.

(c) This follows from parts (a) and (b). ■



With Theorem 4.9 the presentation of the main results in this section is complete. We
end this section by giving two additional theorems in which we explore further
relationships involving $\mathcal{E}_\beta$, $\mathcal{E}^u$, and the thermodynamic functions $\varphi$ and $s$.

In part (a) of the next theorem we refine Theorem 4.6 by proving that
$\mathcal{E}_\beta = \bigcup_{u \in \partial\varphi(\beta) \cap \Gamma} \mathcal{E}^u$, where $\partial\varphi(\beta)$ denotes the superdifferential at $\beta$ of the concave
function $\varphi$ and, as introduced in Lemma 4.1, $\Gamma = \{u \in \mathbb{R}^\sigma : s(u) = s^{**}(u)\}$. This in turn
allows us to give, in part (b), a necessary and sufficient condition for the differentiability
of $\varphi$ at a point $\beta$. Part (c) is a special case of part (b).



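Theorem 4.10. We assume Hypotheses 2.1 and 2.2. The following conclusions hold.

(a) For all $\beta \in \mathbb{R}^\sigma$

$$\mathcal{E}_\beta = \bigcup_{u \in H(\mathcal{E}_\beta)} \mathcal{E}^u = \bigcup_{u \in \partial\varphi(\beta) \cap \Gamma} \mathcal{E}^u.$$

(b) $\varphi$ is differentiable at $\beta$ if and only if both $\mathcal{E}_\beta = \mathcal{E}^u$ for some $u$ and $\partial\varphi(\beta) \subset \Gamma$.

(c) If $s$ is concave on $\mathbb{R}^\sigma$, then $\varphi$ is differentiable at $\beta$ if and only if $\mathcal{E}_\beta = \mathcal{E}^u$ for some $u$.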



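Proof. (a) It follows from part (a) of Lemma 4.1 and part (a) of Lemma 4.3 that

$$\mathcal{E}^u \ne \emptyset \text{ and } \mathcal{E}^u \subset \mathcal{E}_\beta \quad \text{if and only if} \quad s(u) = s^{**}(u) \text{ and } \beta \in \partial s^{**}(u).$$

Since $\beta \in \partial s^{**}(u)$ if and only if $u \in \partial s^{*}(\beta) = \partial\varphi(\beta)$ [15, p. 218], it follows that

$$\mathcal{E}^u \ne \emptyset \text{ and } \mathcal{E}^u \subset \mathcal{E}_\beta \quad \text{if and only if} \quad u \in \partial\varphi(\beta) \cap \Gamma. \tag{4.11}$$

Thus

$$\bigcup_{u \in \partial\varphi(\beta) \cap \Gamma} \mathcal{E}^u \subset \mathcal{E}_\beta.$$

We complete the proof of part (a) by showing that we have equality in this display.
By Theorem 4.6 $\mathcal{E}_\beta$ is a disjoint union of $\mathcal{E}^u$ for $u \in H(\mathcal{E}_\beta) \subset \operatorname{dom} s$. Hence for each
$u \in H(\mathcal{E}_\beta)$, $\mathcal{E}^u \ne \emptyset$ and $\mathcal{E}^u \subset \mathcal{E}_\beta$. Thus (4.11) implies that $H(\mathcal{E}_\beta) \subset \partial\varphi(\beta) \cap \Gamma$. We
conclude that

$$\bigcup_{u \in \partial\varphi(\beta) \cap \Gamma} \mathcal{E}^u \subset \mathcal{E}_\beta = \bigcup_{u \in H(\mathcal{E}_\beta)} \mathcal{E}^u \subset \bigcup_{u \in \partial\varphi(\beta) \cap \Gamma} \mathcal{E}^u,$$

and therefore $\bigcup_{u \in \partial\varphi(\beta) \cap \Gamma} \mathcal{E}^u = \mathcal{E}_\beta$.

(b) We first assume that $\varphi$ is differentiable at $\beta$. Since by part (a) $\partial\varphi(\beta) \cap \Gamma \ne \emptyset$
for any $\beta$, the differentiability of $\varphi$ at $\beta$ implies that $\partial\varphi(\beta) = \{\nabla\varphi(\beta)\} \subset \Gamma$ and that
$\mathcal{E}_\beta = \mathcal{E}^{\nabla\varphi(\beta)}$. We now assume that $\mathcal{E}_\beta = \mathcal{E}^u$ for some $u$ and $\partial\varphi(\beta) \subset \Gamma$. Since part (a)
implies that $\partial\varphi(\beta) \cap \Gamma = \{u\}$, we conclude that $\partial\varphi(\beta) = \partial\varphi(\beta) \cap \Gamma = \{u\}$ and therefore
that $\varphi$ is differentiable at $\beta$.

(c) This follows from part (b) since the concavity of $s$ on $\mathbb{R}^\sigma$ implies that $\Gamma = \mathbb{R}^\sigma$,
and so $\partial\varphi(\beta) \subset \Gamma$ is always true. ■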
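Part (b) of Theorem 4.10 ties the differentiability of $\varphi$ to the structure of $\mathcal{E}_\beta$. The following sketch illustrates this on a hypothetical finite macrostate space (all values of $H$ and $I$ are invented). There $\varphi(\beta) = \min_x \{I(x) + \beta H(x)\}$ is piecewise linear, so its one-sided derivatives can be computed exactly by finite differences, provided the assumed step of 1/4 is smaller than the distance to the nearest breakpoint, which holds for the values of $\beta$ used here.

```python
from fractions import Fraction

# Hypothetical finite macrostate space: name -> (H(x), I(x)), with min I = 0.
X = {"a": (0, 2), "b": (1, 1), "c": (2, 0), "d": (2, 0),
     "e": (3, 1), "f": (4, 0), "g": (5, 2), "h": (1, 3)}

def phi(beta):
    """Canonical free energy of the toy model: inf over x of I(x) + beta*H(x)."""
    return min(I + beta * h for h, I in X.values())

def H_of_E_beta(beta):
    """The set H(E_beta) of values of H attained on the canonical equilibrium set."""
    p = phi(beta)
    return {h for h, I in X.values() if I + beta * h == p}

def one_sided_slopes(beta, step=Fraction(1, 4)):
    """Left and right derivatives of phi at beta by exact finite differences.

    Exact only when no breakpoint of phi lies strictly within `step` of beta.
    """
    left = (phi(beta) - phi(beta - step)) / step
    right = (phi(beta + step) - phi(beta)) / step
    return left, right

for beta in (-2, -1, 0, Fraction(1, 2), 1, 2):
    print(beta, one_sided_slopes(beta), sorted(H_of_E_beta(beta)))
```

In this toy model the left and right derivatives equal the largest and smallest elements of $H(\mathcal{E}_\beta)$, so $\varphi$ is differentiable at $\beta$ exactly when $\mathcal{E}_\beta$ reduces to a single $\mathcal{E}^u$. At $\beta = 0$ the superdifferential is the interval $[2, 4]$, which contains the point $u = 3$ where $s \ne s^{**}$.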



The next theorem is the final result in this section. Under the hypothesis that $s$ is
concave on $\mathbb{R}^\sigma$, part (a) gives a simpler form of the representation in part (a) of Theorem
4.10. Part (b) is a partial converse of part (a).



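Theorem 4.11. We assume Hypotheses 2.1 and 2.2. The following conclusions hold.

(a) Assume that $s$ is concave on $\mathbb{R}^\sigma$. Then for all $\beta \in \mathbb{R}^\sigma$

$$\mathcal{E}_\beta = \bigcup_{u \in H(\mathcal{E}_\beta)} \mathcal{E}^u = \bigcup_{u \in \partial\varphi(\beta)} \mathcal{E}^u.$$

(b) Now assume that for all $\beta \in \mathbb{R}^\sigma$

$$\mathcal{E}_\beta = \bigcup_{u \in \partial\varphi(\beta)} \mathcal{E}^u.$$

Then $s$ is a finite concave function on any convex subset of $\operatorname{ri}(\operatorname{dom} s)$.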

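Proof. (a) Since $s$ is concave on $\mathbb{R}^\sigma$, $\Gamma$ equals $\mathbb{R}^\sigma$ and thus $\partial\varphi(\beta) \cap \Gamma = \partial\varphi(\beta)$ for all
$\beta \in \mathbb{R}^\sigma$. Hence part (a) follows from part (a) of Theorem 4.10.

(b) Since by definition $\mathcal{E}^u = \emptyset$ for all $u \notin \operatorname{dom} s$, it follows from the hypothesis in part
(b) and from part (a) of Theorem 4.10 that for all $\beta \in \mathbb{R}^\sigma$

$$\mathcal{E}_\beta = \bigcup_{u \in \partial\varphi(\beta) \cap \operatorname{dom} s} \mathcal{E}^u = \bigcup_{u \in \partial\varphi(\beta) \cap \Gamma} \mathcal{E}^u.$$

Thus $\partial\varphi(\beta) \cap \operatorname{dom} s = \partial\varphi(\beta) \cap \Gamma$. Taking the union over all $\beta \in \mathbb{R}^\sigma$ yields

$$\bigcup_{\beta \in \mathbb{R}^\sigma} \partial\varphi(\beta) \cap \operatorname{dom} s = \bigcup_{\beta \in \mathbb{R}^\sigma} \partial\varphi(\beta) \cap \Gamma \subset \Gamma.$$

By standard duality theory for upper semicontinuous, concave functions on $\mathbb{R}^\sigma$ [15, p.
218], $\bigcup_{\beta \in \mathbb{R}^\sigma} \partial\varphi(\beta) = \operatorname{dom} \partial s^{**}$. Thus

$$(\operatorname{dom} \partial s^{**}) \cap (\operatorname{dom} s) \subset \Gamma.$$

Since $\operatorname{ri}(\operatorname{dom} s) \subset \operatorname{ri}(\operatorname{dom} s^{**}) \subset \operatorname{dom} \partial s^{**}$, we conclude that $\operatorname{ri}(\operatorname{dom} s) \subset \Gamma$ and therefore
that $s$ is concave on any convex subset of $\operatorname{ri}(\operatorname{dom} s)$. The proof of the theorem is complete.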



In the next section we extend the large deviation theorems in Sections 2 and 3 and
the duality theorems in the present section to the study of mixed ensembles.

5 Mixed Ensembles



In broad terms the canonical ensemble differs from the microcanonical ensemble by the
manner in which the dynamical invariants are incorporated in the respective probability
measures: exponentiation in the former ensemble and conditioning in the latter ensemble.
In Section 5.1 we define two classes of mixed ensembles, a mixed canonical-microcanonical
ensemble and a mixed microcanonical-canonical ensemble, which differ only in the order
in which the exponentiation and the conditioning are performed. In part (b) of Theorem
5.1.1 we show that with respect to both of these ensembles the hidden process $Y_n$ satisfies
the large deviation principle with the same rate function. Hence the sets of equilibrium
macrostates for both of these ensembles are the same. In Section 5.2 we present complete
equivalence and nonequivalence results relating the sets of equilibrium macrostates for
the mixed and the pure canonical ensembles. In Section 5.3, we do the same for the
sets of equilibrium macrostates for the mixed and the pure microcanonical ensembles.
These results will be applied in future work to a number of problems, including soliton
turbulence for the nonlinear Schrödinger equation [17].



5.1 Properties of the Mixed Ensembles 



The definitions of the mixed ensembles involve quantities introduced in Hypotheses 2.1
and 2.2. We shall use the notation $\mathrm{Can}(H_n; P_n)_\beta$ to denote the canonical ensemble $P_{n,\beta}$,
which is defined in (2.1), and the notation $\mathrm{Micro}(H_n; P_n)^{u,r}$ to denote the microcanonical
ensemble $P_n^{u,r}$, which is defined in (3.4). The LDP's for $Y_n$ with respect to the canonical
ensemble and with respect to the microcanonical ensemble are given in Theorems 2.4 and
3.2, respectively. The respective rate functions are

$$I_\beta(x) = I(x) + \langle \beta, H(x) \rangle - \inf_{y \in \mathcal{X}} \{ I(y) + \langle \beta, H(y) \rangle \}$$

and for $u \in \operatorname{dom} J$

$$I^u(x) = \begin{cases} I(x) - J(u) & \text{if } H(x) = u, \\ \infty & \text{otherwise.} \end{cases}$$

In the sequel we shall use the following alternate formula for $I^u$:

$$I^u(x) = I(\{x\} \cap H^{-1}(\{u\})) - J(u).$$

Analogous formulas will arise in the study of the mixed ensembles.

In order to introduce the mixed ensembles, we assume that $\sigma \ge 2$. Let $\tau$ be an integer
satisfying $1 \le \tau < \sigma$ and consider decompositions of $H_n$ and of $H$ defined as follows:

$$H_n = (H_n^1, H_n^2), \text{ where } H_n^1 = (H_{n,1}, \ldots, H_{n,\tau}) \text{ and } H_n^2 = (H_{n,\tau+1}, \ldots, H_{n,\sigma}),$$

$$H = (H^1, H^2), \text{ where } H^1 = (H_1, \ldots, H_\tau) \text{ and } H^2 = (H_{\tau+1}, \ldots, H_\sigma).$$

Writing $\beta = (\beta^1, \beta^2) \in \mathbb{R}^\tau \times \mathbb{R}^{\sigma-\tau}$ and $u = (u^1, u^2) \in \mathbb{R}^\tau \times \mathbb{R}^{\sigma-\tau}$, we define

$$\begin{aligned}
\mathrm{Can}(H_n^1, H_n^2; P_n)_{\beta^1,\beta^2}(d\omega) &= \mathrm{Can}(H_n; P_n)_\beta(d\omega) \\
&= \frac{1}{Z_n(\beta^1, \beta^2)} \exp[-\langle \beta^1, H_n^1(\omega) \rangle - \langle \beta^2, H_n^2(\omega) \rangle] \, P_n(d\omega),
\end{aligned}$$

where $Z_n(\beta^1, \beta^2) = Z_n(\beta)$, and we define

$$\mathrm{Micro}(H_n^1, H_n^2; P_n)^{u^1,u^2,r}(d\omega) = \mathrm{Micro}(H_n; P_n)^{u,r}(d\omega).$$

The function $J(u) = \inf\{I(x) : x \in \mathcal{X}, H(x) = u\}$ plays a key role in the large deviation
analysis of the microcanonical ensemble. We rewrite this function as

$$J(u^1, u^2) = \inf\{I(x) : x \in \mathcal{X}, H^1(x) = u^1, H^2(x) = u^2\}. \tag{5.1.1}$$

The innovation of the present subsection is to consider the asymptotic properties of
two mixed ensembles, both at the level of thermodynamic functions and at the level
of equilibrium macrostates. We define a mixed canonical-microcanonical ensemble by
replacing the measure $P_n$ in the canonical ensemble $\mathrm{Can}(H_n^1; P_n)_{\beta^1}$ by the microcanonical
ensemble $\mathrm{Micro}(H_n^2; P_n)^{u^2,r}$. For $u^2 \in \mathbb{R}^{\sigma-\tau}$ and $\beta^1 \in \mathbb{R}^\tau$, the resulting measure is given
by

$$\mathrm{Can}(H_n^1; \mathrm{Micro}(H_n^2; P_n)^{u^2,r})_{\beta^1}(d\omega)
= \frac{1}{Z_n(\beta^1, \{u^2\}^{(r)})} \exp[-\langle \beta^1, H_n^1(\omega) \rangle] \, P_n(d\omega \mid H_n^2 \in \{u^2\}^{(r)}),$$

where

$$Z_n(\beta^1, \{u^2\}^{(r)}) = \int_{\Omega_n} \exp[-\langle \beta^1, H_n^1(\omega) \rangle] \, P_n(d\omega \mid H_n^2 \in \{u^2\}^{(r)}).$$

By a similar verification as in the paragraph after Proposition 3.1, the microcanonical
ensemble $\mathrm{Micro}(H_n^2; P_n)^{u^2,r}$, and thus this mixed ensemble, are well defined for all sufficiently
large $n$ provided $u^2$ lies in the domain of

$$J^2(u^2) = \inf\{I(x) : x \in \mathcal{X}, H^2(x) = u^2\}. \tag{5.1.2}$$

In an analogous way, we define a mixed microcanonical-canonical ensemble by replacing
the measure $P_n$ in the microcanonical ensemble $\mathrm{Micro}(H_n^2; P_n)^{u^2,r}$ by the canonical
ensemble $\mathrm{Can}(H_n^1; P_n)_{\beta^1}$. For $\beta^1 \in \mathbb{R}^\tau$ and $u^2 \in \mathbb{R}^{\sigma-\tau}$, the resulting measure is given by

$$\mathrm{Micro}(H_n^2; \mathrm{Can}(H_n^1; P_n)_{\beta^1})^{u^2,r}(d\omega) = Q_{n,\beta^1}(d\omega \mid H_n^2 \in \{u^2\}^{(r)}),$$

where

$$Q_{n,\beta^1}(d\omega) = \frac{1}{Z_n(\beta^1)} \exp[-\langle \beta^1, H_n^1(\omega) \rangle] \, P_n(d\omega).$$

This mixed ensemble is well defined for all sufficiently large $n$ provided $u^2$ lies in the
domain of the function $J_{\beta^1}$ that stands in the same relationship to the mixed ensemble
as the function $J$ in (5.1.1) stands to the microcanonical ensemble. Since $J$ is defined in
terms of $I$, which is the rate function in the LDP for $Y_n$ with respect to $P_n$, $J_{\beta^1}$ is defined in
terms of the rate function for $Y_n$ with respect to the canonical ensemble $\mathrm{Can}(H_n^1; P_n)_{a_n\beta^1}$.
By Theorem 2.4, this rate function is given by

$$I_{\beta^1}(x) = I(x) + \langle \beta^1, H^1(x) \rangle - \inf_{y \in \mathcal{X}} \{ I(y) + \langle \beta^1, H^1(y) \rangle \}.$$

It follows that

$$\begin{aligned}
J_{\beta^1}(u^2) &= \inf\{I_{\beta^1}(x) : x \in \mathcal{X}, H^2(x) = u^2\} \\
&= \inf\{I(x) + \langle \beta^1, H^1(x) \rangle : x \in \mathcal{X}, H^2(x) = u^2\} - \inf_{y \in \mathcal{X}} \{ I(y) + \langle \beta^1, H^1(y) \rangle \}.
\end{aligned} \tag{5.1.3}$$

By the discussion earlier in this paragraph, the mixed ensemble $\mathrm{Micro}(H_n^2; \mathrm{Can}(H_n^1; P_n)_{\beta^1})^{u^2,r}$
is well defined for all sufficiently large $n$ provided $u^2$ lies in the domain of $J_{\beta^1}$. Since $H^1(x)$
is finite for all $x \in \mathcal{X}$, $u^2 \in \operatorname{dom} J_{\beta^1}$ if and only if $u^2 \in \operatorname{dom} J^2$. By the same proof as that
of Proposition 3.1, with respect to $P_n$, the sequences $H^2(Y_n)$ and $H_n^2$ satisfy the LDP on
$\mathbb{R}^{\sigma-\tau}$ with rate function $J^2$. As a consequence, $\operatorname{dom} J^2$ is nonempty as is $\operatorname{dom} J_{\beta^1}$.
We recall from Section 4 that

$$s(u) = -J(u) = -\inf\{I(x) : x \in \mathcal{X}, H(x) = u\}$$

defines the microcanonical entropy and that its Legendre-Fenchel transform gives the
canonical free energy. Both functions appear in relationships involving $\mathcal{E}_\beta$ and $\mathcal{E}^u$ that
appear in that section. In an analogous way, for $\beta^1 \in \mathbb{R}^\tau$ and $u^2 \in \mathbb{R}^{\sigma-\tau}$, we define the
entropy with respect to the mixed ensemble $\mathrm{Micro}(H_n^2; \mathrm{Can}(H_n^1; P_n)_{a_n\beta^1})^{u^2,r}$ to be

$$s_{\beta^1}(u^2) = -J_{\beta^1}(u^2). \tag{5.1.4}$$

This entropy and the associated free energy will appear in the results on equivalence and
nonequivalence of ensembles to be given in Section 5.2.

In order to complete the definitions of the various ensembles, we also consider the pure
ensembles

$$\mathrm{Can}(H_n^1; \mathrm{Can}(H_n^2; P_n)_{\beta^2})_{\beta^1} \quad \text{and} \quad \mathrm{Micro}(H_n^1; \mathrm{Micro}(H_n^2; P_n)^{u^2,r})^{u^1,r},$$

which are defined similarly as above. We omit the simple calculation showing that for all
$n$ and $r$

$$\mathrm{Can}(H_n^1; \mathrm{Can}(H_n^2; P_n)_{\beta^2})_{\beta^1}(d\omega) = \mathrm{Can}(H_n^1, H_n^2; P_n)_{\beta^1,\beta^2}(d\omega) \tag{5.1.5}$$

and

$$\mathrm{Micro}(H_n^1; \mathrm{Micro}(H_n^2; P_n)^{u^2,r})^{u^1,r}(d\omega) = \mathrm{Micro}(H_n^1, H_n^2; P_n)^{u^1,u^2,r}(d\omega). \tag{5.1.6}$$

On the other hand, for all $n$ and $r$ the mixed canonical-microcanonical ensemble and the
mixed microcanonical-canonical ensemble are different. In the next theorem we record the
LDP's satisfied by $Y_n$ with respect to the various ensembles introduced in this subsection.
The pleasant surprise is that although the two mixed ensembles are different for all $n$ and
$r$, with respect to each of them, with $\beta^1$ replaced by $a_n\beta^1$, $Y_n$ satisfies the LDP with the
identical rate function.

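Before stating the theorem, we define the rate functions for each ensemble. For
$\beta = (\beta^1, \beta^2) \in \mathbb{R}^\tau \times \mathbb{R}^{\sigma-\tau}$, $u^2 \in \operatorname{dom} J^2$, and $u = (u^1, u^2) \in \operatorname{dom} J$, we define the following
functions mapping $\mathcal{X}$ into $[0, \infty]$:

$$\begin{aligned}
I_{\beta^1,\beta^2}(x) = {} & I(x) + \langle \beta^1, H^1(x) \rangle + \langle \beta^2, H^2(x) \rangle \\
& - \inf_{y \in \mathcal{X}} \{ I(y) + \langle \beta^1, H^1(y) \rangle + \langle \beta^2, H^2(y) \rangle \},
\end{aligned} \tag{5.1.7}$$

$$\begin{aligned}
I_{\beta^1}^{u^2}(x) = {} & I(\{x\} \cap (H^2)^{-1}(\{u^2\})) + \langle \beta^1, H^1(x) \rangle \\
& - \inf\{ I(y) + \langle \beta^1, H^1(y) \rangle : y \in \mathcal{X}, H^2(y) = u^2 \},
\end{aligned} \tag{5.1.8}$$

and

$$I^{u^1,u^2}(x) = I(\{x\} \cap (H^1)^{-1}(\{u^1\}) \cap (H^2)^{-1}(\{u^2\})) - J(u^1, u^2). \tag{5.1.9}$$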



Theorem 5.1.1. We assume Hypotheses 2.1 and 2.2. For $(\beta^1, \beta^2) \in \mathbb{R}^\tau \times \mathbb{R}^{\sigma-\tau}$ the
following conclusions hold.

(a) With respect to the canonical ensemble $\mathrm{Can}(H_n^1, H_n^2; P_n)_{a_n\beta^1, a_n\beta^2}$, $Y_n$ satisfies the
LDP on $\mathcal{X}$ with rate function $I_{\beta^1,\beta^2}$ given in (5.1.7).

(b) Take $u^2 \in \operatorname{dom} J^2$ [see (5.1.2)]. Both with respect to the mixed canonical-microcanonical
ensemble $\mathrm{Can}(H_n^1; \mathrm{Micro}(H_n^2; P_n)^{u^2,r})_{a_n\beta^1}$ and with respect to the mixed microcanonical-canonical
ensemble $\mathrm{Micro}(H_n^2; \mathrm{Can}(H_n^1; P_n)_{a_n\beta^1})^{u^2,r}$, $Y_n$ satisfies the LDP on $\mathcal{X}$, in the
double limit $n \to \infty$ and $r \to 0$, with rate function $I_{\beta^1}^{u^2}$ given in (5.1.8).

(c) Take $u = (u^1, u^2) \in \operatorname{dom} J$ [see (5.1.1)]. With respect to the microcanonical ensemble
$\mathrm{Micro}(H_n^1, H_n^2; P_n)^{u^1,u^2,r}$, $Y_n$ satisfies the LDP on $\mathcal{X}$, in the double limit $n \to \infty$
and $r \to 0$, with rate function $I^{u^1,u^2}$ given in (5.1.9).



Proof. Part (a) is proved in Theorem 2.4, and part (c) is proved in Theorem 3.2. In
part (b) we first prove the LDP for $Y_n$ with respect to $\mathrm{Micro}(H_n^2; \mathrm{Can}(H_n^1; P_n)_{a_n\beta^1})^{u^2,r}$.
Theorem 2.4 implies that with respect to $\mathrm{Can}(H_n^1; P_n)_{a_n\beta^1}$, $Y_n$ satisfies the LDP with rate
function

$$I_{\beta^1}(x) = I(x) + \langle \beta^1, H^1(x) \rangle - \inf_{y \in \mathcal{X}} \{ I(y) + \langle \beta^1, H^1(y) \rangle \}.$$

With $P_n$ replaced by $\mathrm{Can}(H_n^1; P_n)_{a_n\beta^1}$ and $I$ replaced by $I_{\beta^1}$, Theorem 3.2 guarantees
that if $u^2 \in \operatorname{dom} J_{\beta^1} = \operatorname{dom} J^2$, then with respect to $\mathrm{Micro}(H_n^2; \mathrm{Can}(H_n^1; P_n)_{a_n\beta^1})^{u^2,r}$, $Y_n$
satisfies the LDP, in the double limit $n \to \infty$ and $r \to 0$, with rate function

$$(I_{\beta^1})^{u^2}(x) = I_{\beta^1}(\{x\} \cap (H^2)^{-1}(\{u^2\})) - \inf\{ I_{\beta^1}(y) : y \in \mathcal{X}, H^2(y) = u^2 \}.$$

Substituting the definition of $I_{\beta^1}$, we see that

$$\begin{aligned}
(I_{\beta^1})^{u^2}(x) = {} & I(\{x\} \cap (H^2)^{-1}(\{u^2\})) + \langle \beta^1, H^1(x) \rangle \\
& - \inf\{ I(y) + \langle \beta^1, H^1(y) \rangle : y \in \mathcal{X}, H^2(y) = u^2 \}.
\end{aligned}$$

This is the function $I_{\beta^1}^{u^2}$ defined in (5.1.8). We have proved that with respect to
$\mathrm{Micro}(H_n^2; \mathrm{Can}(H_n^1; P_n)_{a_n\beta^1})^{u^2,r}$, $Y_n$ satisfies the LDP, in the double limit $n \to \infty$ and $r \to 0$, with
rate function $I_{\beta^1}^{u^2}$.

We next consider the LDP for $Y_n$ with respect to $\mathrm{Can}(H_n^1; \mathrm{Micro}(H_n^2; P_n)^{u^2,r})_{a_n\beta^1}$.
Since $u^2 \in \operatorname{dom} J^2$, Theorem 3.2 implies that with respect to $\mathrm{Micro}(H_n^2; P_n)^{u^2,r}$, $Y_n$ satisfies
the LDP, in the double limit $n \to \infty$ and $r \to 0$, with rate function

$$I^{u^2}(x) = I(\{x\} \cap (H^2)^{-1}(\{u^2\})) - J^2(u^2).$$

One can easily modify the proof of Theorem 2.4 to handle the situation in which $P_n$
is replaced by a doubly indexed class of probability measures such as $\mathrm{Micro}(H_n^2; P_n)^{u^2,r}$
with the property that with respect to these measures $Y_n$ satisfies the LDP. With this
modification, replacing $P_n$ by $\mathrm{Micro}(H_n^2; P_n)^{u^2,r}$ and $I$ by $I^{u^2}$, we see that with respect
to $\mathrm{Can}(H_n^1; \mathrm{Micro}(H_n^2; P_n)^{u^2,r})_{a_n\beta^1}$, $Y_n$ satisfies the LDP, in the double limit $n \to \infty$ and
$r \to 0$, with rate function

$$\begin{aligned}
(I^{u^2})_{\beta^1}(x) &= I^{u^2}(x) + \langle \beta^1, H^1(x) \rangle - \inf_{y \in \mathcal{X}} \{ I^{u^2}(y) + \langle \beta^1, H^1(y) \rangle \} \\
&= I(\{x\} \cap (H^2)^{-1}(\{u^2\})) + \langle \beta^1, H^1(x) \rangle - \inf\{ I(y) + \langle \beta^1, H^1(y) \rangle : y \in \mathcal{X}, H^2(y) = u^2 \}.
\end{aligned}$$

This is the function $I_{\beta^1}^{u^2}$ defined in (5.1.8). We have shown that with respect to
$\mathrm{Can}(H_n^1; \mathrm{Micro}(H_n^2; P_n)^{u^2,r})_{a_n\beta^1}$, $Y_n$ satisfies the LDP, in the double limit $n \to \infty$ and $r \to 0$, with
rate function $I_{\beta^1}^{u^2}$. The proof of the theorem is complete. ■
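The algebra behind part (b) of Theorem 5.1.1, namely that conditioning a tilted rate function and tilting a conditioned rate function both lead to (5.1.8), can be checked on a finite example. The sketch below assumes a hypothetical six-point macrostate space with two scalar components $H^1$ and $H^2$ (all values invented). It only exercises the rate-function identities; it does not simulate the ensembles or the limits $n \to \infty$, $r \to 0$.

```python
import math

# Hypothetical finite model: name -> (H1(x), H2(x), I(x)), with min I = 0.
X = {"a": (0, 0, 1), "b": (1, 0, 0), "c": (2, 0, 2),
     "d": (0, 1, 0), "e": (1, 1, 3), "f": (2, 1, 1)}

def I_canon(x, b1):
    """I_{beta^1}(x) = I(x) + beta^1*H1(x) - inf_y {I(y) + beta^1*H1(y)}."""
    h1, _, I = X[x]
    return I + b1 * h1 - min(Iy + b1 * g1 for g1, _, Iy in X.values())

def I_micro(x, u2):
    """I^{u^2}(x) = I(x) - J^2(u^2) on {H2 = u2}, infinity elsewhere."""
    _, h2, I = X[x]
    J2 = min(Iy for _, g2, Iy in X.values() if g2 == u2)
    return I - J2 if h2 == u2 else math.inf

def micro_of_canon(x, b1, u2):
    """Condition the canonical rate function I_{beta^1} on {H2 = u2}."""
    if X[x][1] != u2:
        return math.inf
    return I_canon(x, b1) - min(I_canon(y, b1) for y in X if X[y][1] == u2)

def canon_of_micro(x, b1, u2):
    """Tilt the microcanonical rate function I^{u^2} by beta^1*H1."""
    tilt = lambda y: I_micro(y, u2) + b1 * X[y][0]
    return tilt(x) - min(tilt(y) for y in X)

def formula_518(x, b1, u2):
    """Right-hand side of (5.1.8), written out directly."""
    h1, h2, I = X[x]
    if h2 != u2:
        return math.inf
    return I + b1 * h1 - min(Iy + b1 * g1 for g1, g2, Iy in X.values() if g2 == u2)

for x in X:
    print(x, micro_of_canon(x, 1, 0), canon_of_micro(x, 1, 0), formula_518(x, 1, 0))
```

The three columns agree for every macrostate, which mirrors the two computations in the proof: the normalizing infimum of the first operation cancels when the second operation is applied.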

In the next two subsections, we consider equivalence and nonequivalence results for
the ensembles whose LDP's are derived in Theorem 5.1.1. These results are derived as
immediate consequences of our work in Section 4, where equivalence and nonequivalence
results for the canonical and microcanonical ensembles were derived.

5.2 Equivalence and Nonequivalence of the Canonical and Mixed 
Ensembles 

In this subsection we study, at the level of equilibrium macrostates, the equivalence and
nonequivalence of the canonical ensemble $\mathrm{Can}(H_n^1, H_n^2; P_n)_{a_n\beta^1, a_n\beta^2}$ and the mixed ensemble
$\mathrm{Micro}(H_n^2; \mathrm{Can}(H_n^1; P_n)_{a_n\beta^1})^{u^2,r}$. The parameters $\beta^1$, $\beta^2$, and $u^2$ satisfy $\beta^1 \in \mathbb{R}^\tau$,
$\beta^2 \in \mathbb{R}^{\sigma-\tau}$, and $u^2 \in \operatorname{dom} J^2$, where

$$J^2(u^2) = \inf\{I(x) : x \in \mathcal{X}, H^2(x) = u^2\}.$$

By a similar verification as in the paragraph after Proposition 3.1, this condition on
$u^2$ guarantees that the mixed ensemble is well defined for all sufficiently large $n$. The
relationships between the sets of equilibrium macrostates for the two ensembles follow
immediately from Theorems 4.4, 4.6, and 4.8 with minimal changes in proof. Hence we
shall only summarize them in Figure 2.

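By Theorem 5.1.1, for $(\beta^1, \beta^2) \in \mathbb{R}^\tau \times \mathbb{R}^{\sigma-\tau}$, with respect to $\mathrm{Can}(H_n^1, H_n^2; P_n)_{a_n\beta^1, a_n\beta^2}$
$Y_n$ satisfies the LDP with rate function

$$\begin{aligned}
I_{\beta^1,\beta^2}(x) = {} & I(x) + \langle \beta^1, H^1(x) \rangle + \langle \beta^2, H^2(x) \rangle \\
& - \inf_{y \in \mathcal{X}} \{ I(y) + \langle \beta^1, H^1(y) \rangle + \langle \beta^2, H^2(y) \rangle \}.
\end{aligned} \tag{5.2.1}$$

In addition, for $(\beta^1, u^2) \in \mathbb{R}^\tau \times \operatorname{dom} J^2$, with respect to $\mathrm{Micro}(H_n^2; \mathrm{Can}(H_n^1; P_n)_{a_n\beta^1})^{u^2,r}$
$Y_n$ satisfies the LDP with rate function

$$I_{\beta^1}^{u^2}(x) = I(\{x\} \cap (H^2)^{-1}(\{u^2\})) + \langle \beta^1, H^1(x) \rangle - \psi_{\beta^1}^{u^2}, \tag{5.2.2}$$

where

$$\psi_{\beta^1}^{u^2} = \inf\{ I(y) + \langle \beta^1, H^1(y) \rangle : y \in \mathcal{X}, H^2(y) = u^2 \}.$$

For $\beta^1 \in \mathbb{R}^\tau$, $\beta^2 \in \mathbb{R}^{\sigma-\tau}$, and $u^2 \in \operatorname{dom} J^2$, we define the corresponding sets of equilibrium
macrostates

$$\mathcal{E}_{\beta^1,\beta^2} = \{ x \in \mathcal{X} : I_{\beta^1,\beta^2}(x) = 0 \}$$

and

$$\begin{aligned}
\mathcal{E}_{\beta^1}^{u^2} &= \{ x \in \mathcal{X} : I_{\beta^1}^{u^2}(x) = 0 \} \\
&= \{ x \in \mathcal{X} : H^2(x) = u^2, \; I(x) + \langle \beta^1, H^1(x) \rangle = \psi_{\beta^1}^{u^2} \}.
\end{aligned}$$

As the sets of points at which the corresponding rate functions attain their minimum of
0, both $\mathcal{E}_{\beta^1,\beta^2}$ and $\mathcal{E}_{\beta^1}^{u^2}$ are nonempty, compact subsets of $\mathcal{X}$ for $\beta^1 \in \mathbb{R}^\tau$, $\beta^2 \in \mathbb{R}^{\sigma-\tau}$, and
$u^2 \in \operatorname{dom} J^2$. The main purpose of this subsection is to record the relationships between
these sets.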

Before doing so, we point out a concentration property, relative to the set $\mathcal{E}_{\beta^1}^{u^2}$, of the
distributions of $Y_n$ with respect to the mixed ensemble $\mathrm{Micro}(H_n^2; \mathrm{Can}(H_n^1; P_n)_{a_n\beta^1})^{u^2,r}$.
This concentration property is an immediate consequence of the LDP proved in part (b)
of Theorem 5.1.1. It justifies calling $\mathcal{E}_{\beta^1}^{u^2}$ the set of equilibrium macrostates with respect to
the mixed ensemble. This concentration property is analogous to those for the canonical
ensemble and for the microcanonical ensemble given in part (c) of Theorem 2.4 and in
part (b) of Theorem 3.5; the proof is omitted.



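Theorem 5.2.1. We assume Hypotheses 2.1 and 2.2. For $\beta^1 \in \mathbb{R}^\tau$, $u^2 \in \operatorname{dom} J^2$, and
$A$ any Borel subset of $\mathcal{X}$ whose closure $\bar{A}$ satisfies $\bar{A} \cap \mathcal{E}_{\beta^1}^{u^2} = \emptyset$, we have $I_{\beta^1}^{u^2}(\bar{A}) > 0$. In
addition, there exists $r_0 \in (0, 1)$ and for all $r \in (0, r_0]$ there exists $C_r < \infty$ such that

$$\mathrm{Micro}(H_n^2; \mathrm{Can}(H_n^1; P_n)_{a_n\beta^1})^{u^2,r}\{Y_n \in A\} \le C_r \exp[-a_n I_{\beta^1}^{u^2}(\bar{A})/2] \to 0 \quad \text{as } n \to \infty.$$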

As in Theorem |3l], one can also study compactness and weak limit properties of the 
distributions of Y n with respect to Micro(i7 2 ; Can(P^; P n )a„/3 1 ) tt ' r - We shall omit this 
topic. 

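We return to the relationships between $\mathcal{E}_{\beta^1,\beta^2}$ and $\mathcal{E}_{\beta^1}^{u^2}$. Since for each $n$

$$\mathrm{Can}(H_n^1, H_n^2; P_n)_{\beta^1,\beta^2} \quad \text{and} \quad \mathrm{Can}(H_n^2; \mathrm{Can}(H_n^1; P_n)_{\beta^1})_{\beta^2}$$

are equal, we can derive the relationships between these sets of equilibrium macrostates by
applying the results of Section 4 to the canonical ensemble and microcanonical ensemble

$$\mathrm{Can}(H_n^2; Q_n)_{a_n\beta^2} \quad \text{and} \quad \mathrm{Micro}(H_n^2; Q_n)^{u^2,r}, \quad \text{with } Q_n = \mathrm{Can}(H_n^1; P_n)_{a_n\beta^1}.$$

To this end, we introduce the relevant thermodynamic functions. With respect to
$\mathrm{Can}(H_n^2; \mathrm{Can}(H_n^1; P_n)_{a_n\beta^1})_{a_n\beta^2}$ the free energy is given by

$$\begin{aligned}
\varphi_{\beta^1}(\beta^2) &= -\lim_{n \to \infty} \frac{1}{a_n} \log \int_{\Omega_n} \exp[-a_n \langle \beta^2, H_n^2 \rangle] \, d(\mathrm{Can}(H_n^1; P_n)_{a_n\beta^1}) \\
&= \inf_{x \in \mathcal{X}} \{ I(x) + \langle \beta^1, H^1(x) \rangle + \langle \beta^2, H^2(x) \rangle \} - \varphi(\beta^1),
\end{aligned} \tag{5.2.3}$$

where

$$\varphi(\beta^1) = -\lim_{n \to \infty} \frac{1}{a_n} \log \int_{\Omega_n} \exp[-a_n \langle \beta^1, H_n^1 \rangle] \, dP_n = \inf_{y \in \mathcal{X}} \{ I(y) + \langle \beta^1, H^1(y) \rangle \}. \tag{5.2.4}$$

[Figure 2, panel (a): for $(\beta^1, \beta^2) \in \mathbb{R}^\tau \times \mathbb{R}^{\sigma-\tau}$, $\mathcal{E}_{\beta^1,\beta^2}$ is a union of sets $\mathcal{E}_{\beta^1}^{u^2}$.]

(a) For $(\beta^1, \beta^2) \in \mathbb{R}^\tau \times \mathbb{R}^{\sigma-\tau}$, any $x \in \mathcal{E}_{\beta^1,\beta^2}$ lies in some $\mathcal{E}_{\beta^1}^{u^2}$.

[Figure 2, panel (b): three branches for $(\beta^1, u^2) \in \mathbb{R}^\tau \times \operatorname{dom} s_{\beta^1}$.
  $u^2 \in T_{\beta^1}$. Full Equivalence: $\exists \beta^2 \in \partial s_{\beta^1}^{**}(u^2)$ s.t. $\mathcal{E}_{\beta^1}^{u^2} = \mathcal{E}_{\beta^1,\beta^2}$.
  $u^2 \in C_{\beta^1} \setminus T_{\beta^1}$. Partial Equivalence: $\mathcal{E}_{\beta^1}^{u^2} \subsetneq \mathcal{E}_{\beta^1,\beta^2}$ $\forall \beta^2 \in \partial s_{\beta^1}^{**}(u^2)$.
  $u^2 \notin C_{\beta^1}$. Nonequivalence: $\mathcal{E}_{\beta^1}^{u^2} \cap \mathcal{E}_{\beta^1,\beta^2} = \emptyset$ $\forall \beta^2 \in \mathbb{R}^{\sigma-\tau}$.]

(b) For $\beta^1 \in \mathbb{R}^\tau$, there are three possibilities for $u^2 \in \operatorname{dom} s_{\beta^1}$. The two branches on the
left lead to equivalence results, whereas the other branch leads to a nonequivalence
result. The sets $C_{\beta^1}$ and $T_{\beta^1}$ are defined in the last paragraph of Section 5.2.

Figure 2: Equivalence and nonequivalence of canonical and mixed ensembles.

The function $\varphi_{\beta^1}$ is finite, concave, and continuous on $\mathbb{R}^{\sigma-\tau}$. In (5.1.4) we identified the
entropy with respect to $\mathrm{Micro}(H_n^2; \mathrm{Can}(H_n^1; P_n)_{a_n\beta^1})^{u^2,r}$ to be

$$\begin{aligned}
s_{\beta^1}(u^2) &= -\inf\{ I_{\beta^1}(x) : x \in \mathcal{X}, H^2(x) = u^2 \} \\
&= -\inf\{ I(x) + \langle \beta^1, H^1(x) \rangle : x \in \mathcal{X}, H^2(x) = u^2 \} + \varphi(\beta^1);
\end{aligned} \tag{5.2.5}$$

$u^2 \in \operatorname{dom} s_{\beta^1}$ if and only if $u^2 \in \operatorname{dom} J^2$.

As in Section 4, whether or not the entropy $s_{\beta^1}$ is concave on $\mathbb{R}^{\sigma-\tau}$, its Legendre-Fenchel
transform equals $\varphi_{\beta^1}$. If in addition $s_{\beta^1}$ is concave on $\mathbb{R}^{\sigma-\tau}$, then this formula
can be inverted to give $s_{\beta^1} = \varphi_{\beta^1}^{*}$.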
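The statement that the Legendre-Fenchel transform of the mixed entropy $s_{\beta^1}$ equals the free energy $\varphi_{\beta^1}$ can be verified numerically in a finite setting. The sketch below assumes a hypothetical six-point macrostate space with scalar components $H^1$ and $H^2$ (all values invented) and uses the variational formulas (5.2.3), (5.2.4), and (5.2.5) directly; the function names are illustrative and are not notation from the paper.

```python
# Hypothetical finite model: name -> (H1(x), H2(x), I(x)), with min I = 0.
X = {"a": (0, 0, 1), "b": (1, 0, 0), "c": (2, 0, 2),
     "d": (0, 1, 0), "e": (1, 1, 3), "f": (2, 1, 1)}

# dom s_{beta^1}: the values of H2 attained on the toy space.
U2 = sorted({h2 for _, h2, _ in X.values()})

def phi(b1):
    """(5.2.4): inf over y of I(y) + beta^1*H1(y)."""
    return min(I + b1 * h1 for h1, _, I in X.values())

def s_mixed(b1, u2):
    """(5.2.5): -inf{ I + beta^1*H1 : H2 = u2 } + phi(beta^1)."""
    return -min(I + b1 * h1 for h1, h2, I in X.values() if h2 == u2) + phi(b1)

def phi_mixed(b1, b2):
    """(5.2.3): inf{ I + beta^1*H1 + beta^2*H2 } - phi(beta^1)."""
    return min(I + b1 * h1 + b2 * h2 for h1, h2, I in X.values()) - phi(b1)

def legendre_fenchel(b1, b2):
    """Concave Legendre-Fenchel transform of s_mixed(b1, .) evaluated at b2."""
    return min(b2 * u2 - s_mixed(b1, u2) for u2 in U2)

for b2 in (-1, 0, 1):
    print(b2, legendre_fenchel(1, b2), phi_mixed(1, b2))
```

The two printed columns coincide because the infimum over $u^2$ followed by the infimum over $\{H^2 = u^2\}$ is the same as a single infimum over the whole space.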

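For $\beta^1 \in \mathbb{R}^\tau$ the relationships between $\mathcal{E}_{\beta^1,\beta^2}$ and $\mathcal{E}_{\beta^1}^{u^2}$ are summarized in Figure 2.
These relationships depend on two sets that are the analogues of the sets $C$ and $T$ defined
in (4.7) and (4.8). For $\beta^1 \in \mathbb{R}^\tau$ we define $C_{\beta^1}$ to be the set of $u^2 \in \mathbb{R}^{\sigma-\tau}$ for which there
exists $\beta^2 \in \mathbb{R}^{\sigma-\tau}$ such that

$$s_{\beta^1}(w) \le s_{\beta^1}(u^2) + \langle \beta^2, w - u^2 \rangle \quad \text{for all } w \in \mathbb{R}^{\sigma-\tau}.$$

We also define $T_{\beta^1}$ to be the set of $u^2 \in \mathbb{R}^{\sigma-\tau}$ for which there exists $\beta^2 \in \mathbb{R}^{\sigma-\tau}$ such that

$$s_{\beta^1}(w) < s_{\beta^1}(u^2) + \langle \beta^2, w - u^2 \rangle \quad \text{for all } w \ne u^2.$$

As in Lemma 4.1, it can be shown that $C_{\beta^1} = \Gamma_{\beta^1} \cap \operatorname{dom} \partial s_{\beta^1}^{**}$, where
$\Gamma_{\beta^1} = \{ u^2 \in \mathbb{R}^{\sigma-\tau} : s_{\beta^1}(u^2) = s_{\beta^1}^{**}(u^2) \}$.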

5.3 Equivalence and Nonequivalence of the Mixed and Micro- 
canonical Ensembles 

In this subsection we study, at the level of equilibrium macrostates, the equivalence
and nonequivalence of the mixed ensemble $\mathrm{Can}(H^1_n; \mathrm{Micro}(H^2_n; P_n)^{u^2,r})_{a_n\beta^1}$ and the
microcanonical ensemble $\mathrm{Micro}(H^1_n, H^2_n; P_n)^{u^1,u^2,r}$. The parameters $\beta^1$, $u^1$, and $u^2$ satisfy
$\beta^1 \in \mathbb{R}^\tau$, $u^2 \in \operatorname{dom} J^2$, and $(u^1, u^2) \in \operatorname{dom} J$, where

$$J^2(u^2) = \inf\{ I(x) : x \in \mathcal{X},\ H^2(x) = u^2 \}$$

and

$$J(u^1, u^2) = \inf\{ I(x) : x \in \mathcal{X},\ H^1(x) = u^1,\ H^2(x) = u^2 \}.$$

For any $u^1$ and $u^2$, $J^2(u^2) \le J(u^1, u^2)$. Hence, if $(u^1, u^2) \in \operatorname{dom} J$, then $u^2 \in \operatorname{dom} J^2$. By a
verification similar to that in the paragraph after Proposition 3.1, the condition that $(u^1, u^2) \in \operatorname{dom} J$
guarantees that both the mixed ensemble and the microcanonical ensemble are
well defined for all sufficiently large $n$. The relationships between the sets of equilibrium
macrostates for the two ensembles follow immediately from Theorems 4.4, 4.6, and [?]
with minimal changes in proof. Hence we shall only summarize them in Figure 3.

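By Theorem [?], for $(\beta^1, u^2) \in \mathbb{R}^\tau \times (\operatorname{dom} J^2)$, with respect to $\mathrm{Can}(H^1_n; \mathrm{Micro}(H^2_n; P_n)^{u^2,r})_{a_n\beta^1}$,
$Y_n$ satisfies the LDP with rate function

$$I^{u^2}_{\beta^1}(x) = I(\{x\} \cap (H^2)^{-1}(\{u^2\})) + \langle \beta^1, H^1(x) \rangle - \psi^{u^2}_{\beta^1}, \tag{5.3.1}$$

where

$$\psi^{u^2}_{\beta^1} = \inf\{ I(y) + \langle \beta^1, H^1(y) \rangle : y \in \mathcal{X},\ H^2(y) = u^2 \}.$$

In addition, for $(u^1, u^2) \in \operatorname{dom} J$, with respect to $\mathrm{Micro}(H^1_n, H^2_n; P_n)^{u^1,u^2,r}$, $Y_n$ satisfies the
LDP with rate function

$$I^{u^1,u^2}(x) = I(\{x\} \cap (H^1)^{-1}(\{u^1\}) \cap (H^2)^{-1}(\{u^2\})) - J(u^1, u^2).$$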

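For $\beta^1 \in \mathbb{R}^\tau$, $u^2 \in \operatorname{dom} J^2$, and $(u^1, u^2) \in \operatorname{dom} J$, we define the corresponding sets of
equilibrium macrostates

$$\begin{aligned}
\mathcal{E}^{u^2}_{\beta^1} &= \{ x \in \mathcal{X} : I^{u^2}_{\beta^1}(x) = 0 \} \\
&= \{ x \in \mathcal{X} : H^2(x) = u^2,\ I(x) + \langle \beta^1, H^1(x) \rangle = \psi^{u^2}_{\beta^1} \}
\end{aligned}$$

and

$$\begin{aligned}
\mathcal{E}^{u^1,u^2} &= \{ x \in \mathcal{X} : I^{u^1,u^2}(x) = 0 \} \\
&= \{ x \in \mathcal{X} : I(x) = J(u^1, u^2),\ H^1(x) = u^1,\ H^2(x) = u^2 \}.
\end{aligned}$$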

As the sets of points at which the corresponding rate functions attain their minimum of
0, the set $\mathcal{E}^{u^2}_{\beta^1}$, for $\beta^1 \in \mathbb{R}^\tau$ and $u^2 \in \operatorname{dom} J^2$, and the set $\mathcal{E}^{u^1,u^2}$, for $(u^1, u^2) \in \operatorname{dom} J$,
are nonempty and compact. The purpose of this subsection is to record the relationships
between these sets.
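The two sets of equilibrium macrostates can be computed by hand in a small example. The sketch below is purely illustrative and makes several assumptions that are not in the text: the space $\mathcal{X}$ is replaced by a hypothetical six-point set, the values of $I$, $H^1$, and $H^2$ in `STATES` are invented, and $\beta^1$ is sampled from a finite grid of rationals so that all comparisons are exact.

```python
from fractions import Fraction

# Hypothetical finite toy model: state -> (I(x), H^1(x), H^2(x)), with min I = 0.
STATES = {
    "a": (0, 0, 0),
    "b": (3, 1, 0),
    "b2": (5, 1, 0),
    "c": (4, 2, 0),
    "d": (1, 0, 1),
    "e": (2, 1, 1),
}


def argmin_set(candidates, cost):
    """Return the set of candidates attaining the minimum of cost."""
    best = min(cost(x) for x in candidates)
    return {x for x in candidates if cost(x) == best}


def mixed_set(beta1, u2):
    """Mixed-ensemble set: minimizers of I + beta1 * H^1 on the fiber H^2 = u2."""
    fiber = [x for x, (_, _, h2) in STATES.items() if h2 == u2]
    return argmin_set(fiber, lambda x: STATES[x][0] + beta1 * STATES[x][1])


def micro_set(u1, u2):
    """Microcanonical set: minimizers of I subject to H^1 = u1 and H^2 = u2."""
    fiber = [x for x, (_, h1, h2) in STATES.items() if (h1, h2) == (u1, u2)]
    return argmin_set(fiber, lambda x: STATES[x][0])


BETAS = [Fraction(k, 4) for k in range(-40, 41)]  # sampled beta^1 in [-10, 10]

realized = set()
for beta1 in BETAS:
    for u2 in (0, 1):
        for x in mixed_set(beta1, u2):
            # Each mixed equilibrium state is microcanonical at u1 = H^1(x).
            assert x in micro_set(STATES[x][1], u2)
            realized.add(x)

print(sorted(realized))   # states realized by the mixed ensemble for some sampled beta^1
print(micro_set(1, 0))    # microcanonical set at (u^1, u^2) = (1, 0)
```

In this toy model the state `"b"` is the microcanonical equilibrium state at $(u^1, u^2) = (1, 0)$ but is not realized by the mixed ensemble for any sampled $\beta^1$. This is consistent with the fact that $-J(\cdot, 0)$ takes the values $0, -3, -4$ at $u^1 = 0, 1, 2$ and so is not concave at $u^1 = 1$.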

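Since for $(u^1, u^2) \in \operatorname{dom} J$ and each $n$

$$\mathrm{Micro}(H^1_n, H^2_n; P_n)^{u^1,u^2,r} \quad \text{and} \quad \mathrm{Micro}(H^1_n; \mathrm{Micro}(H^2_n; P_n)^{u^2,r})^{u^1,r}$$

are equal, we can derive the relationships between $\mathcal{E}^{u^2}_{\beta^1}$ and $\mathcal{E}^{u^1,u^2}$ by applying the results
of Section 4 to the canonical ensemble and microcanonical ensemble

$$\mathrm{Can}(H^1_n; Q_n)_{a_n\beta^1} \quad \text{and} \quad \mathrm{Micro}(H^1_n; Q_n)^{u^1,r}, \quad \text{with } Q_n = \mathrm{Micro}(H^2_n; P_n)^{u^2,r}.$$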

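To this end, we introduce the relevant thermodynamic functions. By Theorem [?], for
$u^2 \in \operatorname{dom} J^2$ the rate function in the LDP for $Y_n$ with respect to $\mathrm{Micro}(H^2_n; P_n)^{u^2,r}$ is

$$I^{u^2}(x) = I(\{x\} \cap (H^2)^{-1}(\{u^2\})) - J^2(u^2).$$

Hence by the Laplace principle, for $u^2 \in \operatorname{dom} J^2$ the free energy with respect to the
ensemble $\mathrm{Can}(H^1_n; \mathrm{Micro}(H^2_n; P_n)^{u^2,r})_{a_n\beta^1}$ is given by

$$\begin{aligned}
\varphi^{u^2}(\beta^1) &= -\lim_{n\to\infty} \frac{1}{a_n} \log \int \exp[-a_n \langle \beta^1, H^1_n \rangle] \, d(\mathrm{Micro}(H^2_n; P_n)^{u^2,r}) \\
&= \inf_{x \in \mathcal{X}} \{ I^{u^2}(x) + \langle \beta^1, H^1(x) \rangle \} \\
&= \inf\{ I(x) + \langle \beta^1, H^1(x) \rangle : x \in \mathcal{X},\ H^2(x) = u^2 \} - J^2(u^2).
\end{aligned} \tag{5.3.2}$$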

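The function $\varphi^{u^2}$ is finite, concave, and continuous on $\mathbb{R}^\tau$. For $u^2 \in \operatorname{dom} J^2$ we define

$$\begin{aligned}
J^{u^2}(u^1) &= \inf\{ I^{u^2}(x) : x \in \mathcal{X},\ H^1(x) = u^1 \} \\
&= \inf\{ I(x) : x \in \mathcal{X},\ H^1(x) = u^1,\ H^2(x) = u^2 \} - J^2(u^2) \\
&= J(u^1, u^2) - J^2(u^2).
\end{aligned} \tag{5.3.3}$$

With respect to $\mathrm{Micro}(H^1_n; \mathrm{Micro}(H^2_n; P_n)^{u^2,r})^{u^1,r}$, for $u^2 \in \operatorname{dom} J^2$ the entropy is given by

$$s^{u^2}(u^1) = -J^{u^2}(u^1). \tag{5.3.4}$$

We have $u^1 \in \operatorname{dom} s^{u^2}$ if and only if $(u^1, u^2) \in \operatorname{dom} J$.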

As in Section [?], whether or not $s^{u^2}$ is concave on $\mathbb{R}^\tau$, its Legendre-Fenchel transform
$(s^{u^2})^*$ equals $\varphi^{u^2}$. If $s^{u^2}$ is concave on $\mathbb{R}^\tau$, then this formula can be inverted to give
$s^{u^2}(u^1) = (\varphi^{u^2})^*(u^1)$ for all $u^1 \in \mathbb{R}^\tau$.
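The identity $(s^{u^2})^* = \varphi^{u^2}$ can be checked directly in a finite example. The following sketch is an illustration only: the six-point list `STATES` is a hypothetical stand-in for $\mathcal{X}$ with invented values of $I$, $H^1$, $H^2$, and the Legendre-Fenchel transform is taken in the concave convention $\inf_{u^1} \{ \langle \beta^1, u^1 \rangle - s^{u^2}(u^1) \}$, which is assumed here.

```python
from fractions import Fraction

# Hypothetical finite toy model: each state is (I(x), H^1(x), H^2(x)).
STATES = [(0, 0, 0), (3, 1, 0), (5, 1, 0), (4, 2, 0), (1, 0, 1), (2, 1, 1)]


def J2(u2):
    """J^2(u^2): infimum of I over the fiber H^2 = u2."""
    return min(i for i, _, h2 in STATES if h2 == u2)


def J(u1, u2):
    """J(u^1, u^2): infimum of I subject to H^1 = u1 and H^2 = u2."""
    return min(i for i, h1, h2 in STATES if (h1, h2) == (u1, u2))


def phi(beta1, u2):
    """Free energy computed from the last line of (5.3.2)."""
    return min(i + beta1 * h1 for i, h1, h2 in STATES if h2 == u2) - J2(u2)


def entropy(u1, u2):
    """Entropy s^{u^2}(u^1) = -(J(u^1, u^2) - J^2(u^2)), from (5.3.3) and (5.3.4)."""
    return -(J(u1, u2) - J2(u2))


def entropy_conjugate(beta1, u2):
    """Concave Legendre-Fenchel transform of s^{u^2} over its (finite) domain."""
    dom = {h1 for _, h1, h2 in STATES if h2 == u2}
    return min(beta1 * u1 - entropy(u1, u2) for u1 in dom)


BETAS = [Fraction(k, 4) for k in range(-40, 41)]  # sampled beta^1 in [-10, 10]
agree = all(phi(b, u2) == entropy_conjugate(b, u2) for b in BETAS for u2 in (0, 1))
print(agree)
```

The agreement holds here whether or not the toy entropy is concave (on the fiber $u^2 = 0$ it is not), which mirrors the first sentence of the paragraph above. Exact rational arithmetic via `Fraction` avoids spurious floating-point mismatches.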
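[Figure 3, panel (a): diagram for $(u^2, \beta^1) \in \operatorname{dom} J^2 \times \mathbb{R}^\tau$.]

(a) For $(u^2, \beta^1) \in \operatorname{dom} J^2 \times \mathbb{R}^\tau$, any $x \in \mathcal{E}^{u^2}_{\beta^1}$ lies in some $\mathcal{E}^{u^1,u^2}$.

[Figure 3, panel (b): tree diagram for $(u^2, u^1) \in \operatorname{dom} J^2 \times \operatorname{dom} s^{u^2}$ with three branches. Branch $u^1 \in T^{u^2}$: full equivalence, for some $\beta^1 \in \partial (s^{u^2})^{**}(u^1)$. Branch $u^1 \notin T^{u^2}$: partial equivalence, for all $\beta^1 \in \partial (s^{u^2})^{**}(u^1)$. Branch $u^1 \notin C^{u^2}$: nonequivalence, $\mathcal{E}^{u^1,u^2} \cap \mathcal{E}^{u^2}_{\beta^1} = \emptyset$ for all $\beta^1 \in \mathbb{R}^\tau$.]

(b) For $u^2 \in \operatorname{dom} J^2$, there are three possibilities for $u^1 \in \operatorname{dom} s^{u^2}$. The two branches on
the left lead to equivalence results, whereas the other branch leads to a nonequivalence
result. The sets $C^{u^2}$ and $T^{u^2}$ are defined in the next to last paragraph of
Section 5.3.

Figure 3: Equivalence and nonequivalence of mixed and microcanonical ensembles.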



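For $u^2 \in \operatorname{dom} J^2$ the relationships between $\mathcal{E}^{u^2}_{\beta^1}$ and $\mathcal{E}^{u^1,u^2}$ are summarized in Figure
3. These relationships depend on two sets that are the analogues of the sets $C$ and $T$
defined in ([?]) and ([?]). For $u^2 \in \operatorname{dom} J^2$ we define $C^{u^2}$ to be the set of $u^1 \in \mathbb{R}^\tau$ for which
there exists $\beta^1 \in \mathbb{R}^\tau$ such that

$$s^{u^2}(w) \le s^{u^2}(u^1) + \langle \beta^1, w - u^1 \rangle \quad \text{for all } w \in \mathbb{R}^\tau.$$

We also define $T^{u^2}$ to be the set of $u^1 \in \mathbb{R}^\tau$ for which there exists $\beta^1 \in \mathbb{R}^\tau$ such that

$$s^{u^2}(w) < s^{u^2}(u^1) + \langle \beta^1, w - u^1 \rangle \quad \text{for all } w \ne u^1.$$

As in Lemma 4.1, it can be shown that $C^{u^2} = \Gamma^{u^2} \cap \operatorname{dom} \partial (s^{u^2})^{**}$, where
$\Gamma^{u^2} = \{ u^1 \in \mathbb{R}^\tau : s^{u^2}(u^1) = (s^{u^2})^{**}(u^1) \}$.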
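The three cases determined by these sets (a point in $T^{u^2}$, a point in $C^{u^2}$ but not in $T^{u^2}$, and a point outside $C^{u^2}$) can be made concrete with a one-dimensional example. The sketch below rests on stated assumptions: `s` is a hypothetical differentiable double-well function standing in for $s^{u^2}$, so the only candidate slope at $u$ is the derivative `ds(u)`; the supporting-line inequalities are tested on a finite grid with a small tolerance rather than for all $w$.

```python
def s(u):
    # Hypothetical double-well entropy in one dimension (an assumption).
    return -(u * u - 1.0) ** 2


def ds(u):
    # Derivative of s; for differentiable s it is the only candidate slope beta.
    return -4.0 * u * (u * u - 1.0)


GRID = [i / 50 for i in range(-100, 101)]   # test points w in [-2, 2]
TOL = 1e-9


def classify(u):
    """Classify u by whether the tangent line at u supports s on the grid."""
    beta = ds(u)
    # gap(w) = s(u) + beta*(w - u) - s(w); supporting means gap >= 0 for all w.
    gaps = [s(u) + beta * (w - u) - s(w) for w in GRID if abs(w - u) > TOL]
    if min(gaps) > TOL:
        return "T"          # strictly supporting line
    if min(gaps) >= -TOL:
        return "C \\ T"     # supporting line that touches s at another point
    return "not in C"       # no supporting line


for u in (-1.5, -1.0, 0.0, 0.5, 1.0, 1.5):
    print(u, classify(u))
```

For this toy entropy the points with $|u| > 1$ fall in the first case, the two maxima $u = \pm 1$ share the horizontal supporting line and fall in the second case, and the points with $|u| < 1$ admit no supporting line.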

With Figure 3, we complete our presentation of the equivalence and nonequivalence
results for the mixed ensemble, the canonical ensemble, and the microcanonical ensemble.